REMARKS

Status of the Claims 

Claims 11 and 13-22 are pending in the instant application. Claims 1-10, 12, and 23 have
been canceled without prejudice or disclaimer of the subject matter claimed therein. Claims 19-22
have been withdrawn from examination as being drawn to a non-elected invention. Claims 11
and 13-18 are currently under examination.

Applicants thank the Examiner for re-grouping claims 17, 18, and 23 in Group I. 

Should the Examiner find claim 17 allowable, Applicants request the opportunity to amend
claims 19-22 such that they would depend from claim 17 and consequently be rejoined as well
(see MPEP § 821.04).

Amendments to the Claims 

Claims 11, 16, and 17 have been amended.

Support for the amendment to claim 11 is found in canceled claim 12.

Support for the amendment to claim 16 is found in Example 1, which shows that the
nucleic acid encoding a protein of interest can be integrated into the gene encoding HtrA
protease. Thus, the nucleic acid can be integrated, but it need not be integrated into the gene
encoding the HtrA protease.

Support for the amendment to claim 17 is found in claim 11 and in claim 17 itself.

These amendments do not introduce prohibited new matter. 

Rejection Under 35 U.S.C. § 112, Second Paragraph

Claims 11-18 and 23 have been rejected as being indefinite for failing to particularly
point out and distinctly claim the subject matter of the invention.

Claims 11, 16, and 23 have been rejected for reciting "HtrA protease" because it is not
clear what is meant by the term. As shown on page 1, lines 30 and 31, HtrA protease is a
housekeeping serine protease that degrades abnormally or incorrectly folded proteins exported
by the bacteria. Pages 1-3 of the specification provide citations that refer to the protease,
showing that the protease is well known by its name "HtrA protease." Additionally, the
specification, on page 8, lines 6-11, provides a definition for HtrA protease.

Claims 15 and 18 are rejected for reciting "PrtP protease." As shown on page 3, lines 34-39,
and on page 4, lines 27-35, the term "PrtP protease" refers to a specific protease. Annex 2
and Buist et al indicate that PrtP, a PIII-type protease, is structurally and functionally distinct
from HtrA protease. Moreover, Applicants have performed a search to show that PrtP and HtrA
proteases are distinct proteins. Annex 1 provides the amino acid sequence of HtrA protease of
Lactococcus lactis. Annex 2 provides the amino acid sequence of PrtP protease of Lactococcus
lactis. Annex 3 provides a copy of a CDD search performed with the amino acid sequence of
HtrA of Lactococcus lactis as the query sequence, which confirms that PrtP and HtrA proteases
are structurally distinct. Accordingly, the PrtP protease in claims 15 and 18 correctly refers to an
additional protease.

Claim 13 is rejected for reciting various bacterial strains. Page 7, lines 8-12, and page 9,
lines 18-22, provide a list of various Gram positive bacteria that produce the HtrA protease and
can be used in the present invention. The preferred embodiment is a Lactobacillus strain.
However, other Gram positive bacteria express the HtrA protease. The attached results of an
internet search (Google) indicate that various bacteria express HtrA protease. Additionally,
Annexes 4-7, which contain the results of BLAST searches performed against available
sequences of several Gram positive bacteria such as Lactobacilli, Lactococci, and Streptococci,
confirm that various Gram positive bacteria express the HtrA protease. Thus, the bacterial
strains encompassed by claim 13 could be used to practice the method of claim 11.

Accordingly, Applicants respectfully request withdrawal of this rejection. 

Rejection Under 35 U.S.C. § 102(b)

A. Claims 23 and 11-15 are rejected under 35 U.S.C. § 102(b) as being anticipated by
Bayles et al.

Claim 23 has been canceled and claims 11-15 have been amended to recite a step for
recovering the protein and that the size of the genome of the bacterial strain is equal to or less
than 3.2 Mb.

Bayles et al disclose mutant Listeria monocytogenes comprising an HtrA gene. However,
Bayles et al do not disclose a method of using this mutant bacterium to produce an exported
protein. Claims 11-15 are directed to a method of producing a desired protein comprising
culturing a Gram positive bacterial strain that expresses the protein and that has a genome of less
than or equal to 3.2 Mb and recovering the protein exported from the bacterial strain.
Accordingly, Bayles et al do not anticipate the claimed invention. Applicants respectfully
request withdrawal of this rejection.

B. Claims 23 and 11-15 are rejected under 35 U.S.C. § 102(b) as being anticipated by
Buist et al

Claim 23 has been canceled and claims 11-15 have been amended to recite a step for 
recovering the protein and that the size of the genome of the bacterial strain is equal to or less 
than 3.2 Mb. 

Buist et al disclose a PrtP negative Lactococcus lactis strain. However, Buist et al do
not disclose a Gram positive bacterial strain that does not express a functional HtrA protease or
the use of such a strain to produce a desired protein. As discussed above, HtrA protease and PrtP
protease are structurally and functionally distinct proteases. Accordingly, Buist et al do not
anticipate the claimed invention. Applicants respectfully request withdrawal of this rejection.

The Examiner notes that the prior art teaches multiple protease-deficient strains of B.
subtilis. Applicants respectfully point out that, unlike the bacterial strains used in the claimed
invention, B. subtilis has a large genome of about 4.2 Mb that encodes several functional HtrA
proteases such as YyxA, YkdA, and YvtB/Yirf and numerous other extracellular proteases.

C. Claims 23 and 11-15 are rejected under 35 U.S.C. § 102(b) as being anticipated by
Smeds et al

Claim 23 has been canceled and claims 11-15 have been amended to recite a step for 
recovering the protein and that the size of the genome of the bacterial strain is equal to or less 
than 3.2 Mb. 

Smeds et al disclose a strain of mutant Lactobacillus helveticus in which the gusA
reporter gene was inserted downstream of the htrA promoter. The gusA reporter gene encodes
β-glucuronidase. Although culturing the mutant Lactobacillus helveticus induces the gusA mRNA,
culturing the mutant bacterial strain did not induce the expression of β-glucuronidase, the protein
of interest (page 6152, col. 1, second full paragraph). Thus, the cited reference does not and
could not teach recovering the protein of interest. Accordingly, Smeds et al do not anticipate
the claimed invention. Applicants respectfully request withdrawal of this rejection.

Rejection Under 35 U.S.C. § 103(a) 

Claims 16, 17, and 18 are rejected under 35 U.S.C. § 103(a) as being unpatentable over
Bayles et al or Buist et al as applied to claims 23 and 11-15 above, and further in view of any
one of Dougan et al or Georgiou et al

Claim 16 is directed to a method of producing a protein of interest using Gram positive
bacteria that have a genome of less than or equal to 3.2 Mb and that do not express a functional
HtrA protease. Claims 17 and 18 are directed to a Gram positive bacterial strain that has a genome
of less than or equal to 3.2 Mb and that does not express a functional HtrA protease. The small
size of the genome leaves no room for the presence of other proteases of the HtrA family or for
the presence of other proteases having a similar function. This ensures that no residual
proteolytic activity remains after the single HtrA protease has been inactivated.

The deficiencies of Bayles et al and Buist et al are discussed above. 

Dougan et al teach Gram-negative bacteria having a mutation in the degP gene of the
HtrA family for expressing a heterologous antigen. It seems that the degQ and degS genes of the
HtrA family are still intact and functional. Thus, the htrA protease gene of these bacteria is still
functional with respect to the degQ and degS genes. Accordingly, Dougan et al do not teach the
use of Gram positive bacteria that have a genome of less than or equal to 3.2 Mb and that do not
express a functional HtrA protease for producing a desired protein.

Similarly, Georgiou et al disclose the use of mutant Gram negative bacteria that are
multiply protease deficient for producing proteolytically sensitive polypeptides. Specifically,
Georgiou et al teach mutant Gram negative bacteria deficient in DegP, OmpT, and/or Protease
III protease. It appears that since the degQ and degS genes of the HtrA family are intact, the htrA gene
must be functional. Accordingly, Georgiou et al do not teach the use of Gram positive bacteria
that have a genome of less than or equal to 3.2 Mb and that do not express a functional HtrA
protease, which bacteria are used for producing a desired protein.

Applicants respectfully point out that Georgiou et al stated that inactivation of a single
protease is not sufficient to prevent the degradation of exported polypeptides (col. 2, lines 30-40).
Thus, Georgiou et al teach inactivating multiple proteolytic enzymes. However, they point
out that there is no assurance that disablement or deletion of any given protease or combination
of proteases will result in a viable unchanged host cell or that such manipulation will avoid the
precipitation of toxic events within the cell. Accordingly, although it is possible to create mutant
organisms having deficiencies in more than three proteases because many Gram negative
bacteria express at least seven or eight different proteases that degrade secreted polypeptides
(col. 6, 1st paragraph), Georgiou et al stated that deactivating a large number of proteolytic
enzymes at some point will compromise the cell's viability (col. 6, 3rd paragraph).

Accordingly, there would not have been any reasonable expectation of success in 
obtaining the Gram positive bacteria of claims 17 and 18 or the method of using the Gram 
positive bacteria described in claim 16 by combining the cited references. Furthermore, there 
would not have been any motivation to combine the cited references since Bayles et al and Buist 
et al teach Gram positive bacteria while Dougan et al and Georgiou et al teach Gram negative 
bacteria. Gram positive bacteria are different from Gram negative bacteria in many aspects, 
including production of proteolytic enzymes. Applicants respectfully request withdrawal of the 
rejection. 

Conclusion 

The foregoing amendments and remarks are being made to place the application in 
condition for allowance. Applicants respectfully request entry of the amendments, 
reconsideration, and the timely allowance of the pending claims. A favorable action is awaited. 
Should the Examiner find that an interview would be helpful to further prosecution of this
application, the Examiner is invited to telephone the undersigned at the Examiner's convenience.

If there are any additional fees due in connection with the filing of this response, please
charge the fees to our Deposit Account No. 50-0310. If a fee is required for an extension of time
under 37 C.F.R. § 1.136 not accounted for above, such an extension is requested and the fee 
should also be charged to our Deposit Account. 

Respectfully submitted,
Morgan, Lewis & Bockius LLP

Date: November 10, 2003

Morgan, Lewis & Bockius LLP
Customer No. 09629
1111 Pennsylvania Avenue, N.W.
Washington, D.C. 20004
Tel: 202-739-3000
Fax: 202-739-3001

1-WA/2080104.1



ANNEX 1 



NiceProt View of Swiss-Prot: Q9LA06

Entry information

Entry name                          HTRA_LACLA
Primary accession number            Q9LA06
Secondary accession numbers         None
Entered in Swiss-Prot in            Release 40, October 2001
Sequence was last modified in       Release 40, October 2001
Annotations were last modified in   Release 41, February 2003

Name and origin of the protein

Protein name   Serine protease do-like htrA
Synonyms       EC 3.4.21.-
               HtrALl
Gene name      HTRA or LL2136
From           Lactococcus lactis (subsp. lactis) (Streptococcus lactis) [TaxID: 1360]
Taxonomy       Bacteria; Firmicutes; Lactobacillales; Streptococcaceae; Lactococcus.

References 

[1] SEQUENCE FROM NUCLEIC ACID, AND CHARACTERIZATION.
STRAIN=IL1403;
MEDLINE=20177820; PubMed=10712686;
Poquet I., Saint V., Seznec E., Simoes N., Bolotin A., Gruss A.;
"HtrA is the unique surface housekeeping protease in Lactococcus lactis and is required
for natural protein processing.";
Mol. Microbiol. 35:1042-1051(2000).

[2] SEQUENCE FROM NUCLEIC ACID.
STRAIN=IL1403;
MEDLINE=21235186; PubMed=11337471;
Bolotin A., Wincker P., Mauger S., Jaillon O., Malarme K., Weissenbach J., Ehrlich S.D.,
Sorokin A.;
"The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis
IL1403.";
Genome Res. 11:731-753(2001).

Comments

• FUNCTION: DEGRADES ABNORMAL EXPORTED PROTEINS. NEEDED FOR
THE PRO-PEPTIDE PROCESSING OF A NATURAL PRO-PROTEIN AND FOR
MATURATION OF A NATIVE PROTEIN. RESPONSIBLE FOR THE
HOUSEKEEPING OF EXPORTED PROTEINS.

• SUBCELLULAR LOCATION: Membrane-bound (Probable).

• SIMILARITY: Belongs to peptidase family S2C.

• SIMILARITY: Contains 1 PDZ/DHR domain.



Copyright 

This SWISS-PROT entry is copyright. It is produced through a collaboration between the Swiss Institute of 
Bioinformatics and the EMBL outstation - the European Bioinformatics Institute. There are no restrictions on its 
use by non-profit institutions as long as its content is in no way modified and this statement is not removed. 
Usage by and for commercial entities requires a license agreement (See http://www.isb-sib.ch/announce/ or send 
an email to license@isb-sib.ch) 



Cross-references

EMBL        AF155705; AAF61294.1; -.
            AE006442; AAK06234.1; -.
PIR         H86891; H86891.
MEROPS      S01.273; -.
InterPro    IPR009003; Cys_Ser_trypsin.
            IPR001478; PDZ.
            IPR001254; Peptidase_S1.
            IPR001940; Peptidase_S1C.
Pfam        PF00595; PDZ; 1.
            PF00089; trypsin; 1.
PRINTS      PR00834; PROTEASES2C.
SMART       SM00228; PDZ; 1.
PROSITE     PS50106; PDZ; 1.
Implicit links to
            CMR; ProDom; HOBACGEN; BLOCKS; ProtoNet; ProtoMap; PRESAGE;
            DIP; ModBase; SWISS-2DPAGE.

Keywords

Hydrolase; Serine protease; Transmembrane; Complete proteome.

Features

Key         From    To      Length  Description
TRANSMEM    6       26      21      POTENTIAL.
DOMAIN      88      284     197     CATALYTIC.
DOMAIN      302     383     82      PDZ.
ACT_SITE    127     127             CHARGE RELAY SYSTEM (POTENTIAL).
ACT_SITE    157     157             CHARGE RELAY SYSTEM (POTENTIAL).
ACT_SITE    239     239             CHARGE RELAY SYSTEM (POTENTIAL).

Sequence information

Length: 408 AA
Molecular weight: 41648 Da
CRC64: 581B90B55A7DF851 [This is a checksum on the sequence]

        10         20         30         40         50         60
         |          |          |          |          |          |
MAKANIGKLL LTGVVGGAIA LGGSAIYQST TNQSANNSRS NTTSTKVSNV SVNVNTDVTS

        70         80         90        100        110        120
         |          |          |          |          |          |
AIKKVSNSVV SVMNYQKDNS QSSDFSSIFG GNSGSSSSTD GLQLSSEGSG VIYKKSGGDA

       130        140        150        160        170        180
         |          |          |          |          |          |
YVVTNYHVIA GNSSLDVLLS GGQKVKASVV GYDEYTDLAV LKISSEHVKD VATFADSSKL

       190        200        210        220        230        240
         |          |          |          |          |          |
TIGEPAIAVG SPLGSQFANT ATEGILSATS RQVTLTQENG QTTNINAIQT DAAINPGNSG

       250        260        270        280        290        300
         |          |          |          |          |          |
GALINIEGQV IGITQSKITT TEDGSTSVEG LGFAIPSNDV VNIINKLEAD GKISRPALGI

       310        320        330        340        350        360
         |          |          |          |          |          |
RMVDLSQLST NDSSQLKLPS SVTGGVVVYS VQSGLPAASA GLKAGDVITK VGDTAVTSST

       370        380        390        400
         |          |          |          |
DLQSALYSHN INDTVKVTYY RDGKSNTADV KLSKSTSDLE TSSPSSSN

ANNEX 2

NiceProt View of Swiss-Prot: P15292

Entry information

Entry name                          P3P_LACLC
Primary accession number            P15292
Secondary accession numbers         None
Entered in Swiss-Prot in            Release 14, April 1990
Sequence was last modified in       Release 14, April 1990
Annotations were last modified in   Release 41, February 2003

Name and origin of the protein

Protein name   PIII-type proteinase [Precursor]
Synonyms       EC 3.4.21.96
               Lactocepin
               Cell wall-associated serine proteinase
Gene name      PRTP
From           Lactococcus lactis (subsp. cremoris) (Streptococcus cremoris) [TaxID: 1359]
Encoded on     Plasmid.
Taxonomy       Bacteria; Firmicutes; Lactobacillales; Streptococcaceae; Lactococcus.



References 

[1] SEQUENCE FROM NUCLEIC ACID, AND SEQUENCE OF 188-197.
STRAIN=SK11;
MEDLINE=89340435; PubMed=2760036;
Vos P., Simons G., Siezen R.J., de Vos W.M.;
"Primary structure and organization of the gene for a procaryotic, cell envelope-located
serine proteinase.";
J. Biol. Chem. 264:13579-13585(1989).

Comments

• FUNCTION: PROTEASE WHICH BREAKS DOWN MILK PROTEINS DURING
THE GROWTH OF THE BACTERIA ON MILK.

• CATALYTIC ACTIVITY: Endopeptidase activity with very broad specificity,
although some subsite preferences have been noted, e.g. large hydrophobic residues in
the P1 and P4 positions, and Pro in the P2 position. Best known for its action on
caseins, although it has been shown to hydrolyze hemoglobin and oxidized insulin B-chain.

• SUBCELLULAR LOCATION: Attached to the cell wall peptidoglycan by an amide
bond (Potential).

• SIMILARITY: Belongs to peptidase family S8.

Copyright

This SWISS-PROT entry is copyright. It is produced through a collaboration between the Swiss Institute of 
Bioinformatics and the EMBL outstation - the European Bioinformatics Institute. There are no restrictions on its 
use by non-profit institutions as long as its content is in no way modified and this statement is not removed. 
Usage by and for commercial entities requires a license agreement (See http://www.isb-sib.ch/announce/ or send 
an email to license@isb-sib.ch) 



Cross-references

EMBL        J04962; AAA03533.1; ALT_SEQ.
HSSP        P00782; 2SBT.
MEROPS      S08.019; -.
InterPro    IPR001899; Gram_pos_anchor.
            IPR003137; PA.
            IPR000209; Peptidase_S8.
Pfam        PF00746; Gram_pos_anchor; 1.
            PF02225; PA; 1.
            PF00082; Peptidase_S8; 1.
PRINTS      PR00723; SUBTILISIN.
TIGRFAMs    TIGR01167; LPXTG_anchor; 1.
PROSITE     PS50847; GRAM_POS_ANCHORING; 1.
            PS00136; SUBTILASE_ASP; 1.
            PS00137; SUBTILASE_HIS; 1.
            PS00138; SUBTILASE_SER; 1.
Implicit links to
            ProDom; HOBACGEN; BLOCKS; ProtoNet; ProtoMap; PRESAGE; DIP;
            ModBase; SWISS-2DPAGE.

Keywords

Hydrolase; Serine protease; Cell wall; Peptidoglycan-anchor; Zymogen; Signal;
Plasmid.

Features

Key         From    To      Length  Description
SIGNAL      1       33      33
PROPEP      34      187     154
CHAIN       188     1870    1683    PIII-TYPE PROTEINASE.
PROPEP      1871    1902    32      REMOVED BY SORTASE (POTENTIAL).
ACT_SITE    217     217             CHARGE RELAY SYSTEM (BY SIMILARITY).
ACT_SITE    281     281             CHARGE RELAY SYSTEM (BY SIMILARITY).
ACT_SITE    620     620             CHARGE RELAY SYSTEM (BY SIMILARITY).
SITE        1867    1871    5       LPXTG SORTING SIGNAL (POTENTIAL).
MOD_RES     1870    1870            AMIDE-LINKED TO CELL WALL (POTENTIAL).

Sequence information

Length: 1902 AA [This is the length of the unprocessed precursor]
Molecular weight: 200550 Da [This is the MW of the unprocessed precursor]
CRC64: 87CECBAA9345F9D3 [This is a checksum on the sequence]

        10         20         30         40         50         60
         |          |          |          |          |          |
MQRKKKGLSI LLAGTVALGA LAVLPVGEIQ AKAAISQQTK GSSLANTVTA ATAKQAATDT

        70         80         90        100        110        120
         |          |          |          |          |          |
TAATTNQAIA TQLAAKGIDY NKLNKVQQQD IYVDVIVQMS AAPASENGIL RTDYSSTAEI

       130        140        150        160        170        180
         |          |          |          |          |          |
QQETNKVIAA QASVKAAVEQ VTQQTAGESY GYVVNGFSTK VRVVDIPKLK QIAGVKTVTL

       190        200        210        220        230        240
         |          |          |          |          |          |
AKVYYPTDAK ANSMANVQAV WSNYKYKGEG TVVSVIDSGI DPTHKDMRLS DDKDVKLTKS

       250        260        270        280        290        300
         |          |          |          |          |          |
DVEKFTDTVK HGRYFNSKVP YGFNYADNND TITDDKVDEQ HGMHVAGIIG ANGTGDDPAK

       310        320        330        340        350        360
         |          |          |          |          |          |
SVVGVAPEAQ LLAMKVFSNS DTSAKTGSAT VVSAIEDSAK IGADVLNMSL GSNSGNQTLE

       370        380        390        400        410        420
         |          |          |          |          |          |
DPELAAVQNA NESGTAAVIS AGNSGTSGSA TEGVNKDYYG LQDNEMVGSP GTSRGATTVA

       430        440        450        460        470        480
         |          |          |          |          |          |
SAENTDVITQ AVTITDGTGL QLGPETIQLS SHDFTGSFDQ KKFYIVKDAS GNLSKGALAD

       490        500        510        520        530        540
         |          |          |          |          |          |
YTADAKGKIA IVKRGEFSFD DKQKYAQAAG AAGLIIVNTD GTATPMTSIA LTTTFPTFGL

       550        560        570        580        590        600
         |          |          |          |          |          |
SSVTGQKLVD WVTAHPDDSL GVKITLAMLP NQKYTEDKMS DFTSYGPVSN LSFKPDITAP

       610        620        630        640        650        660
         |          |          |          |          |          |
GGNIWSTQNN NGYTNMSGTS MASPFIAGSQ ALLKQALNNK NNPFYAYYKQ LKGTALTDFL

       670        680        690        700        710        720
         |          |          |          |          |          |
KTVEMNTAQP INDINYNNVI VSPRRQGAGL VDVKAAIDAL EKNPSTVVAE NGYPAVELKD

       730        740        750        760        770        780
         |          |          |          |          |          |
FTSTDKTFKL TFTNRTTHEL TYQMDSNTDT NAVYTSATDP NSGVLYDKKI DGAAIKAGSN

       790        800        810        820        830        840
         |          |          |          |          |          |
ITVPAGKTAQ IEFTLSLPKS FDQQQFVEGF LNFKGSDGSR LNLPYMGFFG DWNDGKIVDS

       850        860        870        880        890        900
         |          |          |          |          |          |
LNGITYSPAG GNFGTVPLLK NKNTGTQYYG GMVTDADGNK TVDDQAIAFS SDKNALYNDI

       910        920        930        940        950        960
         |          |          |          |          |          |
SMKYYLLRNI SNVQVDILDG QGNKVTTLSS STNRKKTYYN AHSQQYIYYN APAWDGTYYD

       970        980        990       1000       1010       1020
         |          |          |          |          |          |
QRDGNIKTAD DGSYTYRISG VPEGGDKRQV FDVPFKLDSK APTVRHVALS AKTENGKTQY

      1030       1040       1050       1060       1070       1080
         |          |          |          |          |          |
YLTAEAKDDL SGLDATKSVK TEINEVTNLD ATFTDAGTTA DGYTKIETPL SDEQAQALGN

      1090       1100       1110       1120       1130       1140
         |          |          |          |          |          |
GDNSAELYLT DNASNATDQD ASVQKPGSTS FDLIVNGGGI PDKISSTTTG YEANTQGGGT

      1150       1160       1170       1180       1190       1200
         |          |          |          |          |          |
YTFSGTYPAA VDGTYTDAQG KKHDLNTTYD AATNSFTASM PVTNADYAAQ VDLYADKAHT

      1210       1220       1230       1240       1250       1260
         |          |          |          |          |          |
QLLKHFDTKV RLMAPTFTDL KFNNGSDQTS EATIKVTGTV SADTKTVNVG HTVAALDAQH

      1270       1280       1290       1300       1310       1320
         |          |          |          |          |          |
HFSVDVPVNY GDNTIKVTAT DKDGNTTTEQ KTITSSYDPD MLKKSVTFDQ GVKFGTNKFN

      1330       1340       1350       1360       1370       1380
         |          |          |          |          |          |
ATSAKFYDPK TGIATITGKV KHPTTTLQVD GKQIPIKDDL TFSFTLDLGT LGQKPFGVVV

      1390       1400       1410       1420       1430       1440
         |          |          |          |          |          |
GDTTQNKTFQ EALSFILDAV APTLSLDSST DAPVYTNDPN FQITGTATDN AQYLSLSING

      1450       1460       1470       1480       1490       1500
         |          |          |          |          |          |
SSVASQYEDI NINSGKPGHM AIDQPVKLLE GKNVLTVAVT DSEDNTTTKN ITVYYEPKKT

      1510       1520       1530       1540       1550       1560
         |          |          |          |          |          |
LAAPTVTPST TEPAQTVTLT ANAAATGETV QYSADGGKTY QDVPAAGVTI TANGTFKFKS

      1570       1580       1590       1600       1610       1620
         |          |          |          |          |          |
TDLYGNESPA VDYVVTNIKA DDPAQLQAAK QELTNLIASA KTLSASGKYD DATTTALAAA

      1630       1640       1650       1660       1670       1680
         |          |          |          |          |          |
TQKAQTALDQ TNASVDSLTG ANRDLQTAIN QLAAKLPADK KTSLLNQLQS VKDALGTDLG

      1690       1700       1710       1720       1730       1740
         |          |          |          |          |          |
NQTDPSTGKT FTAALDDLVA QAQAGTQTDD QLQATLAKIL DEVLAKLAEG IKAATPAEVG

      1750       1760       1770       1780       1790       1800
         |          |          |          |          |          |
NAKDAATGKT WYADIADTLT SGQASADASD KLAHLQALQS LKTKVAAAVE AAKTVGKGDG

      1810       1820       1830       1840       1850       1860
         |          |          |          |          |          |
TTGTSDKGGG QGTPAPAPGD TGKDKGDEGS QPSSGGNIPT KPATTTSTTT DDTTDRNGQL

      1870       1880       1890       1900
         |          |          |          |
TSGKGALPKT GETTERPAFG FLGVIVVSLM GVLGLKRKQR EE

ANNEX 3

NCBI Conserved Domain Database

CD: COG0265.1, DegQ    Query added    PSSM-Id: 10140    Source: Cog
Description: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification,
protein turnover, chaperones]
Taxa: cellular organisms    Related: may span multiple domains
Status: Alignment from source    Created: 7-Oct-2002
Aligned: 135 rows    PSSM: 347 columns    Representative: Consensus

10 20 30 40 50 60 

|. |....*....|....*....|. .|- | 

consensus 1 LLVLAGLDLAVG --LLLIAAIAGG RALTSAA- - 29 

query 1 makanigkllltgwggaialggsaiyqs 29 

gi 15616584 22 IGISAFIGAILGal 1 vLFS VPALSGLgwlpye i dsgapt eETGQLTEap 70 

gi 22001651 47 WFRPLLGGVIGGsla lgiYTFTPLGNHDsqdtakq sssqqqtQSVTATSts 97 

gi 1731364 23 YFLSSLIGVIVGav lmAFIMPYLSNEgldtg aldqQ- -QNNNgr 64 

gi 14194653 6 igkLLLTGWGGaia lggS AI YQSTTNQs aNNSRSNTts 44 

gi 15840667 141 AAAALGTPALAApaphgalagsgkLGVRDVLFGGkvsylalgilvaialviGGIGGVIgr 200 

gi 15902042 14 LLWIVISFFSGal gsFSITQLTQKSs vNNSNNNSti 50 

gi 15675945 10 sLSILLIGFLGGli ailTFNNLYPHSp sKINSGKAtt 46 

gi 16799397 122 YFLTAL I GVI I Gg 1 i I FFVAWDNGDnad ttS- -NSNNkp 158 

70 80 90 100 110 120

consensus    30 GQR LS FATAVE KVAPAWS I ATGLTAKL R 58
query        30 ttnqsannsrsnttstkvsnvsvnvntdvtsaikkvsnswsvmnyqkdnsqssdf ssif 89
gi 15616584  71 ndietvnYAVn SDVS Q AVE KVS DAW -GIVSMTNGs 105
gi 22001651  98 seskkssSSSsafk sedsSKISDMVEDLSPAIV-GITNLQAQSns 141
gi 1731364   65 esirtvnVSVn NAVTKIVSNMSPAW-GWNIQKSd 99
gi 14194653  45 tkvsnvsVNVn TD VTS AI KKVSNS W - S VMNYQKDN sqssd--fssifG 90
gi 15840667  201 ktaewdAFTtskvtlsttgnaQEPAGRFTKVAAAVADSWTIESV 246
gi 15902042  51 t---qtaYKNe NSTTQAVNKVKDAW-SVITYSANRqns vfG 88
gi 15675945  47 s---nmvFNNt TNTTKAVKAVQNAW-SVINYQDNPssslsnpytklf G 91
gi 16799397  159 tkvekvsVNTt SDVTKAVDKVQDAW-SVLNYQSSSsl-- 195

130 140 150 160 170 180

consensus    59 SFF-PSDPP- - LRS AEGLGSGF I IS SDG^^-.|YI VTNHHV^I AGAEE I TVTL 102
query        90 ggnsgsssstdglqlssEGSGVIYK---KSGgte^ 139
gi 15616584  106 -mFsSSEEE EGTGSGVIYKkegDRA-".FIw|EH^ISGANQVev 146
gi 22001651  142 s 1 FgS S S SDs s EDTE SGSGSGVI FKkenGKA- YI f JNNHVjVEGAS SLkv 189
gi 1731364   100 -iWgESGE AGSGSGVI YKkndHS A - ir YVYMw^. EGAS Q I e i 139
gi 14194653  91 GNSgSSSSTdgLQ-LSSEGSGVIYK^sgGD^||^T^H^IAGNSSLdv 137
gi 15840667  247 SDQE GMQGSGVIVD- - - GRplg&IOTNNH\^I S EAANN - - - Psqf kttv 286
gi 15902042  89 Nd- -DTDTDsqR- - ISSEGSGVIYKkndKEA-|vY^ 132
gi 15675945  92 EGRsKENKDaeLS-IFSEGSGVIYRkdgNS^-r-YWTNNHyilDGAKRIei 138
gi 16799397  196 dgTtSSEKE- - AS S GS GV I YKkanGKAf -;;r Y IVTNNHVVAD ANKL e v - 237

190 200 210 220 230 240

consensus    103 --ADGREVPAKLVGKDPISpEA|LKIDGAGGLPVIALGDSDKLRVGDVWAIGNPFG--L 158
query        140 - -SGGQKVKASWGYDEYTDLA^LKISSEHVKDVATFADSSKLTIGEPAIAVGSPLGsqF 197
gi 15616584  147 vlTDGSRLPAEVLGSDVFTjDlJ|LEIDGSDVETVAEFGNSDLLSPGEPAIAIGNPLGlrF 206
gi 22001651  190 S 1 YDGTEVTAKLVGSDSLTDL^LQI SDDHVTKVANFGDSSDLRTGETVI AIGDPLGkdL 249
gi 1731364   140 slKDGSRVSADLVGSDQLMDLAVjLRVKSDKIKAVADFGNSDKVKSGEPVIAIGNPLGleF 199
gi 14194653  138 llSGGQKVKASWGYDEYTDLA^LKISSEHVKDVATFADSSKLTIGEPAIAVGSPLGsqF 197
gi 15840667  287 vf NDGKEVPANLVGRDPKTDLA^LKVDNVDNLTVARLGDS SKVRVGDEVLAVGAPLG - - L 344
gi 15902042  133 rlSDGTKVPGEIVGADTFSDIA|VKISSEKVTTVAEFGDSSKLTVGETAIAIGSPLGseY 192
gi 15675945  139 lmADGSKWGELVGADTYSjbliA|vKISSDKIKTVAEFADSTKLNVGEVAIAIGSPLGtqY 198
gi 16799397  238 tfTNGKKSEAKLLGTDEWNDLAWLEIDDKNVSTVATFGDSDSLKLGEPAIAIGSPLGteF 297

250 260 270 280 290 300

consensus    159 GQTVTSG I VS ALGR T-GVGSAGG YVNF I QTDAAI N PjGNSGGPLVN I DGE W| 208
query        198 ANTATEGILSATSRqvtlT-QENGQTT NINAIQTDAAINPj^SGGAl§NIEGQVIj 251
gi 15616584  207 SSSVTLGIISATER S i PI DLTGNgqidwQAEVLQTDAAINPj^NSGGAf VNIQGQVIj 262
gi 22001651  250 SRTVTQGIVSGVDR TVSMSTSAGe tSINVIQTDAAINPGNSGGPBIiNTDGKI^ 302
gi 1731364   200 AGSVTQGVISGTER- ■ - -AiPVDSNGDgqpdwNAEVLQTDAAINPGNSGG^ 255
gi 14194653  198 ANTATEGILSATSR- • - - QvTLTQENGqt - - tNINAIQTDAAINPjJwSGGALINI EGQVIj 251
gi 15840667  345 RSTVTQGIVSALHR- •--PvPLSGEGSdt-dtVIDAIQTDASINHpSG^LIDME&QV^ 399
gi 15902042  193 ANTVTQGIVSSLNR- • - -NvSLKSEDGqa- - iSTKAIQTDTAINP^SG^SriOQQVIj 246
gi 15675945  199 ANSVTQGIVSSLSR- — TvTLKNENGet- -vSTNAIQTDAAINP&SGGPLINIEGQVIl 252
gi 16799397  298 SGSVTQGI ISGLNR- • - - Av PVDTNGDq t e dwEAD V I QTDAA I NPQNSGGAllN I EGQVIj 353

310 320 330 340 350 360

consensus    209 §18 TAIIAPSGG S SGI GFAI PVNLVAPVLDEL I S KGKWRGYLGVIGE 256
query        252 bilqskiTTTEDGSTS VEGLGFAIPSNDWNIINKLEADGKISRPALGIRMV 303
gi 15616584  263 piN SMKIAQSt VEG I GFAI PSNLAI P VI EDLE FYGDVQRPQMGVAFR 309
gi 22001651  303 GIN-- -SMKISEDd VEGIGFAIPSNDVKPIAEELLSKGQIERPYIGVSML 349
gi 1731364   256 blN-- -SMKIAESa VEGIGLSI PSKLVI PVIEDLERYGKVKRPFLGIEMK 302
gi 14194653  252 jSIIJ-- - QS KI TTTEdgs 1 sVEGLGFAI PSNDWNI INKLEADGKI SRPALGI RMV 303
gi 15840667  400 piN-- -TAGKSLsd saSGLGFAIPVNEMKLVANSLIKDGKIVHPTLGISTR 447
gi 15902042  247 GITp- -SSKIATngg 1 SVEGLGFAI PANDAINIIEQLEKNGKVTRPALGIQMV 296
gi 15675945  253 GI*I-- -SSKISSTPTgsngnsgaVEGIGFAIPSTDVIKIIKQLETNGEVIRPALGISMV 308
gi 16799397  354 GIN-- -SMKISMEn VEGISFAIPSNTVEPIIEQLETKGEVERPSLGVSLR 400

370 380 390 400 410 420

consensus    257 PLTADIA LGLP- - - VAAGAWLGVLPGS PAAKAG I KAGD 1 1 TAVNGK 300
query        304 DLSQLST NDSS QLKLPSSVTGGVWysvqsglPAASAGLKAGDVITKVGDT 354
gi 15616584  310 SLSEI PSf hweetLKLPe -dVKGGWITDI VPMS PAETAGLRQYDVIVELNGE 361
gi 22001651  350 DLEQVPQnyqegtLGLFgsqLNKGVYIREVASGS P AE KAGL KAED I I I GL KGK 402
gi 1731364   303 SLSDIASyhwdetLKLPk-nVTNGAWMGVDAFS PAGKAGLKELDVI TEFDGY 354
gi 14194653  304 DLSQLSTnds - sqLKLPs - sVTGGVWYSVQSGL PAAS AGL KAGDVI T KVGDT 354
gi 15840667  448 _-S--VSn alASGAQVANVKAGS PAQKGGILENDVIVKVGNR 485
gi 15902042  297 NLSNVSTsdi-rrLNIPs-nVTSGVIVRSVQSNM PAN-GHLEKYDVITKVDDK 346
gi 15675945  309 NLNDLSTnal-sqINIPt-sVTGGIWAEVKEGM PAS - GKLAQ YDVI TE I DGK 358
gi 16799397  401 DVDTIPEtqqkniLKLPd-sVDYGAMVQQWSGS AADKAGLKQYDVIVELNGE 452

430 440 450 460

consensus    301 PVASLSDLVAAVASNR- - PGDEVALKLLRGGKERELAVTLGDrSPLSAS 347
query        355 AVTSSTDLQSALYSHN- - INDTVKVTYYRDGKSNTADVklskstsdlet 401
gi 15616584  362 D INDGHELRKFL YTELn - 1 GDEVEVTY YREGKKETTTLTL VEQQS S 406
gi 22001651  403 E I DTGS ELRNI L YKDAk - 1 GDTVEVKI LRNGKEMTKKI KLDQ - KEEKTS 449
gi 1731364   355 KVNDIVDLRKRLYQKK- -VGDRVKVKFYRGGKEKSVDIKLSS-ADQLGS 400
gi 14194653  355 AVTSSTDLQSALYSHN- -INDTVKVTYYRDGKSNTADVKLSK-STSDLE 400
gi 15840667  486 AVADSDEFWAVRQLA- - IGQDAPIEWREGRHVTLTVKPDP-DST 528
gi 15902042  347 EIASSTDLQSALYNHS- -IGDTIKITYYRNGKEETTSIKLNK-SSGDLE 392
gi 15675945  359 TVNSISDLQSSLYGHD- -INDTIKVTFYRGTTKKKADIKLTK-TTQDLT 404
gi 16799397  453 KVTNSMTLRKI L YGNDvk I GDKVKVKY YRDGKEQS TD I KLEA - AKTTT - 499

ANNEX 4

BLASTP 2.2.6 [Apr-09-2003]

RID: 1067437801-2 92 -103 0821. BLASTQ3

Query= gi|15674118|ref|NP_268293.1| exported serine protease
[Lactococcus lactis subsp. lactis]
(408 letters)

Database: Unfinished Lactobacillus gasseri; Completed Lactobacillus plantarum WCFS1;
Completed Lactococcus lactis subsp. lactis;
Unfinished Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293;
Unfinished Oenococcus oeni MCW;
Completed Streptococcus agalactiae 2603V/R;
Completed Streptococcus agalactiae NEM316;
15,229 sequences; 4,501,851 total letters



                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

ref|NP_268293.1|     exported serine protease [Lactococcus lact...    576   e-165
ref|NP_689159.1|     serine protease [Streptococcus agalactiae ...    303   5e-83
ref|NP_783901.1|     serine protease HtrA [Lactobacillus planta...    283   5e-77
ref|ZP_00063134.1|   COG0265: Trypsin-like serine proteases, ...      281   2e-76
ref|ZP_00069121.1|   COG0265: Trypsin-like serine proteases, ...      272   7e-74
ref|ZP_00046803.1|   COG0265: Trypsin-like serine proteases, ...      230   3e-61
ref|ZP_00070364.1|   COG0265: Trypsin-like serine proteases, ...      181   2e-46
ref|ZP_00064063.1|   COG0265: Trypsin-like serine proteases, ...      180   4e-46
ref|ZP_00070156.1|   COG0750: Predicted membrane-associated Z...       45   2e-05
ref|NP_266705.1|     UDP-N-acetylglucosamine 1-carboxyvinyltran...     35   0.034
ref|ZP_00046513|     COG2996: Uncharacterized protein conserv...       33   0.097
ref|ZP_00063264.1|   COG0750: Predicted membrane-associated Z...       33   0.13
ref|NP_785411.1|     carboxy-terminal processing proteinase [La...     33   0.13
ref|NP_786668.1|     extracellular protein [Lactobacillus plant...     32   0.28
ref|NP_268285.1|     hypothetical protein [Lactococcus lactis s...     31   0.48
ref|NP_267651.1|     sugar ABC transporter substrate binding pr...     30   1.1
ref|NP_687067.1|     peptidase M23/M37 family [Streptococcus a...      30   1.1
ref|NP_784951.1|    cell surface SD repeat protein precursor [...     29   1.4
ref|NP_786644.1|    extracellular protein, gamma-D-glutamate-m...     29   1.4
ref|NP_687090.1|    alcohol dehydrogenase, propanol-preferring...     29   1.8
ref|NP_267008.1|    hypothetical protein [Lactococcus lactis s...     29   1.8
ref|NP_734524.1|    Unknown [Streptococcus agalactiae NEM316]         29   1.8
ref|ZP_00064050.1|  COG1364: N-acetylglutamate synthase (N-a...       28   2.4
ref|ZP_00063238.1|  COG1674: DNA segregation ATPase FtsK/Spo...       28   3.1
ref|NP_735868.1|    Unknown [Streptococcus agalactiae NEM316]         28   3.1
ref|NP_784552.1|    acetyltransferase (putative) [Lactobacillu...     28   3.1
ref|ZP_00070200.1|  COG1477: Membrane-associated lipoprotein...       28   3.1
ref|ZP_00063415.1|  COG1668: ABC-type Na+ efflux pump, perme...       28   4.1
ref|NP_785643.1|    endopeptidase La (putative) [Lactobacillus...     28   4.1
ref|ZP_00069981.1|  COG3051: Citrate lyase, alpha subunit [O...       28   4.1
ref|ZP_00064376.1|  COG1364: N-acetylglutamate synthase (N-a...       28   4.1
ref|NP_688903.1|    membrane-associated zinc metalloprotease, ...     28   4.1
ref|NP_268318.1|    hypothetical protein [Lactococcus lactis s...     27   5.3
ref|ZP_00046283.1|  COG0507: ATP-dependent exoDNAse (exonucl...       27   5.3
ref|ZP_00069420.1|  COG3480: Predicted secreted protein cont...       27   5.3
ref|ZP_00063200.1|  COG0827: Adenine-specific DNA methylase ...       27   7.0
ref|ZP_00062802.1|  hypothetical protein [Leuconostoc mesent...       27   7.0
ref|NP_687818.1|    major facilitator family protein [Streptoc...     27   7.0
ref|NP_688359.1|    conserved hypothetical protein [Streptococ...     27   7.0
ref|NP_688028.1|    sensor histidine kinase, putative [Strepto...     27   7.0
ref|NP_735272.1|    Unknown [Streptococcus agalactiae NEM316]         27   7.0
ref|NP_786635.1|    extracellular protein [Lactobacillus plant...     27   7.0
ref|ZP_00046678.1|  COG4653: Predicted phage phi-C31 gp36 ma...       27   7.0
ref|ZP_00046947.1|  COG2931: RTX toxins and related Ca2+-bin...       27   7.0
ref|ZP_00046780.1|  COG3210: Large exoproteins involved in h...       27   7.0
ref|ZP_00062638.1|  COG4932: Predicted outer membrane protei...       27   9.1
ref|NP_687888.1|    exonuclease RexA [Streptococcus agalactiae...     27   9.1
ref|NP_687383.1|    3-oxoacyl-(acyl-carrier-protein) synthase ...     27   9.1

Alignments 

> ref|NP_268293.1| exported serine protease [Lactococcus lactis subsp. lactis] 
Length = 408 


Score = 576 bits (1484), Expect = e-165 

Identities = 310/390 (79%), Positives = 310/390 (79%) 

Query: 1    MAKANIGKLLLTGVVGGAIALGGSAIYQXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 60 
            MAKANIGKLLLTGVVGGAIALGGSAIYQ 
Sbjct: 1    MAKANIGKLLLTGVVGGAIALGGSAIYQSTTNQSANNSRSNTTSTKVSNVSVlTOn'DVTS 60 

Query: 61   AIKKXXXXXXXXMNYQKDNSQXXXXXXXXXXXXXXXXXXXXXXXXXEGSGVIYKKSGGDA 120 
            AIKK        MNYQKDNSQ                         EGSGVIYKKSGGDA 
Sbjct: 61   AIKKVSNSVVSVMNYQKDNSQSSDFSSIFGGNSGSSSSTDGLQLSSEGSGVIYKKSGGDA 120 

Query: 121  YVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFADSSKL 180 
            YVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFADSSKL 
Sbjct: 121  YVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFADSSKL 180 

Query: 181  TIGEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAINPGNSG 240 
            TIGEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAINPGNSG 
Sbjct: 181  TIGEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAINPGNSG 240 

Query: 241  GALINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKLEADGKISRPALGI 300 
            GALINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKLEADGKISRPALGI 
Sbjct: 241  GALINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKLEADGKISRPALGI 300 

Query: 301  RMVDLSQLSTNDSSQLKLPXXXXXXXXXXXXXXXLPAASAGLKAGDVITKVGDTAVTSST 360 
            RMVDLSQLSTNDSSQLKLP               LPAASAGLKAGDVITKVGDTAVTSST 
Sbjct: 301  RMVDLSQLSTNDSSQLKLPSSVTGGVVVYSVQSGLPAASAGLKAGDVITKVGDTAVTSST 360 

Query: 361  DLQSALYSHNINDTVKVTYYRDGKSNTADV 390 
            DLQSALYSHNINDTVKVTYYRDGKSNTADV 
Sbjct: 361  DLQSALYSHNINDTVKVTYYRDGKSNTADV 390 

> ref|NP_689159.1| serine protease [Streptococcus agalactiae 2603V/R] 
ref|NP_736563.1| Unknown [Streptococcus agalactiae NEM316] 
Length = 409 

Score = 303 bits (775), Expect = 5e-83 

Identities = 160/290 (55%), Positives = 201/290 (69%), Gaps = 7/290 (2%) 

Query: 107 EGSGVIYKKSGGDAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSE 166 

EGSGVIYKK G +l^p$IISi|l G +++ L+ G K +VG D Y+DLAV+KI S + 
Sbjct: 105 EGSGVIYKKDGKN|§gf^^ 164 

Query: 167 HVKDVATFADSSKLTIGEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNIN 226 

V ++A FADSSKL IGE AIA+GSPLG+++AN+ T+GI+S+ R VT+T E GQT + N 
Sbjct: 165 KVSNIAEFADSSKLNIGETAIAIGSPLGTEYANSVTQGIVSSLKRTVTMTNEEGQTVSTN 224 

Query:- 227 AIQTDAAINPpl^^^P^p^|QSKITTT- EDGSTSVEGLGFAI PSNDV 280 

AIQTDAAINPllsGGl^NSi^iGil SKI++T + SVEG+GFAI PSNDV 

Sbjct: 225 AIQTDAAINPj^S<SG|^ 284 



Query: 281 VNIINKLEADGKISRPALGIRMVDLSQLSTNDSSQLKLPXXXXXXXXXXXXXXXLPAASA 340 

V IIN+LE++G++ RPALGI M LS L ++ S+LK+P +P A 

Sbjct: 285 VKIINQLESNGQVERPALGISMAGLSNLPSDVISKLKIPSNVTNGIVVASIQSGMP-AQG 343 

Query: 341 GLKAGDVITKVGDTAVTSSTDLQSALYSHNINDTVKVTYYRDGKSNTADV 390 

LK DVITKV D V S +DLQS LY H + D++ VT+YR T + 

Sbjct: 344 KLKKYDVITKVDDKEVVSPSDLQSLLYGHQVGDSITVTFYRGENKQTVTI 393 
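
The "Identities / Positives / Gaps" summary line that heads each alignment can be checked mechanically. The following is a minimal sketch, not part of the BLAST output: the helper name `parse_summary` is hypothetical, and it assumes the percentages are truncated rather than rounded, which is consistent with the figures in this report (for example, 129/182 is shown as 70%).

```python
import re

# Matches e.g. "Identities = 160/290" or "Gaps = 7/290"
SUMMARY_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_summary(line: str) -> dict:
    """Return {'Identities': (matched, aligned_length, percent), ...} for one summary line."""
    stats = {}
    for name, num, den in SUMMARY_RE.findall(line):
        n, d = int(num), int(den)
        stats[name] = (n, d, 100 * n // d)  # truncated percentage (assumption)
    return stats


if __name__ == "__main__":
    line = "Identities = 160/290 (55%), Positives = 201/290 (69%), Gaps = 7/290 (2%)"
    for name, (n, d, pct) in parse_summary(line).items():
        print(f"{name}: {n}/{d} = {pct}%")
```

Lines that carry no "Gaps" entry simply yield no "Gaps" key.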

> ref|NP_783901.1| serine protease HtrA [Lactobacillus plantarum WCFS1] 
Length = 420 

Score = 283 bits (723), Expect = 5e-77 

Identities = 162/394 (41%), Positives = 214/394 (54%), Gaps = 14/394 (3%) 

Query: 10 LLTGVVGGAIALGGSAIYQXXXXXXXXXXXXXXXXXXXXXXXXXX----XXXXXXAIKKX 65 

L+ G++GG +A GG +Q +. 
Sbjct: 14 LVAGLIGGGVAYGGINYFQNNNIATSSTSVPTGSNKSGSTSTTNVKVNVSSQATKVFENN 73 

Query: 66 XXXXXXXMNYQKDNSQXXXX-------XXXXXXXXXXXXXXXXXXXXXEGSGVIYKKSGG 118 

+N QK +S EGSG+IYKKSG 
Sbjct: 74 KAAWSVINLQKKSSSSSWSGILGGDDSSGSDSSSSSDSSSSKLEEYSEGSGLIYKKSGD 133. 

Query: 119 DAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFADSS 178 

AY+M*N?I$TV;++G+S++ V++S G K+ A +VG D TDEAVLKI+S V A+F +S 
Sbjct: 134 A§ff||^ 193 

Query: 179 KLTIGEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTN-INAIQTDAAINPG 237 

+ +GE A+A+GSP+GS +A T T+GI+SA R V T +GQTT IQTD AIN G 

Sbjct: 194 NIKVGETALAIGSPMGSNYATTLTQGIISAKKRTVATTNTSGQTTGYATVIQTDTAINS^ 253 



Query: 238 NSGGALINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKLEADGKISRPA 297 

N|^|Si|l^^ilf| K+ + G TSVEG+GFAIPSN+W IIN+L G++ RPA 
Sbjct: 254 NI3|§S^ 312 
Query: 298 LGIRMVDLSQLSTND-SSQLKLPXXXXXXXXXXXXXXXLPAASAGLKAGDVITKVGDTAV 356 

LG+ DLS +S++D S LKLP PA +AGL DVIT++G V 

Sbjct: 313 LGVATYDLSNISSSDQKSVLKLPTSVTKGVVIMKTYSGSPAKAAGLTKYDVITELGGKKV 372 

Query: 357 TSSTDLQSALYSHNINDTVKVTYYRDGKSNTADV 390 

TS L+SALY+H++NDTV V YY +GK TA+ + 
Sbjct: 373 TSLATLRSALYAHSVNDTVTVKYYHNGKLKTANM 406 

> ref|ZP_00063134.1| COG0265: Trypsin-like serine proteases, typically periplasmic, 
contain C-terminal PDZ domain [Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293] 
Length = 379 

Score = 281 bits (718), Expect = 2e-76 

Identities = 160/392 (40%), Positives = 214/392 (54%), Gaps = 23/392 (5%) 

Query: 1 MAKANIGKLLLTGVVGGAIALGGSAIYQXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 60 

M + + K LLTGV+ G + GG+ +Y 
Sbjct: 1 MVQPALTKTLLTGVIAGVVG-GGAILYGQQGVQLLQNQNQKVSTTATSTKTIAKNATATS 59 

Query: 61 AIKKXXXXXXXXMNYQKDNSQXXXXXXXXXXXXXXXXXXXXXXXXXEGSGVIYKKSGGDA 120 

A K +N+ K + EGSGVIYKK+ G | 

Sbjct: 60 AYNKVSDAVVSVLNFTKSSQ-------------------GSYQESSEGSGVIYKKTDGSA 100 

Query: 121 YVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFADSSKL 180 

G + + V+L G+KV A++VG D TDIi&KI V A F DSSK+ 
Sbjct: 101 fel^^^ITGAAKIQVMLHSGKKVTATLVGKDAMTg^^KIDGTDVTTTAQFGDSSKI 160 

Query: 181 TIGEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQT-TNINAIQTDAAINPGNS 239 

T+GE +A+GSPLGS++A++ T+GI+SA R V T ENGQ IQTDAAINP||| 
Sbjct: 161 TVGENVLAIGSPLGSEYASSVTQGIISAKKRLVEATSENGQNYGGSTVIQTDAAINPlll 220 



Query: 240 GGALINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKLEADGKISRPALG 299 

K++T+ G TSVEG+GFAIPS+ W+I+NKL DGK++RPA+G 
Sbjct: 221 ^P^^^3ii^^^5MKLSTSSSG-TSVEGMGFAIPSDQVVDIVNKLVKDGKVTRPAIG 279 



Query: 300 IRMVDLSQLSTND-SSQLKLPXXXXXXXXXXXXXXXLPAASAGLKAGDVITKVGDTAVTS 358 

I +++LS+++ ++ S LK+P PA AGLK DVI + V+S 

Sbjct: 280 ISLINLSEVTASEQKSTLKIPDSVTGGVVVMSLTNNGPADKAGLKKYDVIVGINGKKVSS 339 

Query: 359 STDLQSALYSHNINDTVKVTYYRDGKSNTADV 390 

DL+ LY H++ DT+ +TYY T V 

Sbjct: 340 QADLREELYKHSLGDTITLTYYHQDTKQTVKV 371 

> ref|ZP_00069121.1| COG0265: Trypsin-like serine proteases, typically periplasmic, 
contain C-terminal PDZ domain [Oenococcus oeni MCW] 
Length = 425 

Score = 272 bits (696), Expect = 7e-74 

Identities = 160/390 (41%), Positives = 210/390 (53%), Gaps = 6/390 (1%) 

Query: 6 IGKLLLTGVVGGAIALGGSAIY-QXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXAIKK 64 

I LL G++GG +A+G IY Q 
Sbjct: 29 IATALLAGLLGGGVAVGAGYIYTQTTDFIGKSTGALSDGKTTIKAPTISGKSNATKVYNN 88 

Query: 65 XXXXXXXXMNYQKDNSQXXXXXXXXXXXXXXXXXXXXXXXXXEGSGVIYKKSGGDAYVVT 124 
+N Q +S EGSGVIYK + G SgggE 

Sbjct: 89 LKGAWSVINQQATSSSSTIYGDSSKKSSSSTSSFSTLQTASEGSGVIYKDADGYjftY:;!^ 148 



Query: 125 NYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFADSSKLTIGE 184 

N^Htfl+G + V+L GG KV A VG D T&S^L+IS VK VA F +S+ + + G+ 
Sbjct: 14 9 wfi^I SGAKRIQWLYGGTKVVAKKVGSDAMtS^^LRI SGSDVKTVAQFGNSNQI KTGQ 20 8 

Query: 185 PAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTT--NINAIQTDAAINPGNSGGA 242 

+A+GSPLG+ +A++ TEGI+SA+ R V+ T E+G+T + AIQTDAAINP|n||g^ 
Sbjct: 209 TVLAIGSPLGTDYASSVTEGIISASKRLVSNTSESGKTNYGDSIAIQTDAAINPgNS§|2 268 

Query: 243 LINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKLEADGKISRPALGIRM 302 

KfN^|vlGl3j K+T T++G SVEG+GFAIPSN W+IINKL GK+ RPALG+ + 
Sbjct: 269 ||n||^v|g|nSQKLTETDEGE-SVEGMGFAI 327 

Query: 303 VDLSQLSTN-DSSQLKLPX-XXXXXXXXXXXXXXLPAASAGLKAGDVITKVGDTAVTSST 360 

VDLS ++S++ LKLP PA AG+K DVI V V++ 

Sbjct: 328 VDLSEVSSDVVKKTLKLPSKVKTGIVIAGFSSDKSPAKKAGIKKYDVIVAVNGEKVSNLA 387 

Query: 361 DLQSALYSHNINDTVKVTYYRDGKSNTADV 390 

D+ + +Y + DTVK+TYYR T V 

Sbjct: 388 DMRDIIYKLKVGDTVKITYYRASTEKTVKV 417 

> ref|ZP_00046803.1| COG0265: Trypsin-like serine proteases, typically periplasmic, 
contain C-terminal PDZ domain [Lactobacillus gasseri] 
Length = 666 

Score = 230 bits (587), Expect = 3e-61 

Identities = 123/243 (50%), Positives = 162/243 (66%), Gaps = 2/243 (0%) 

Query: 107 EGSGVIYKKSGGDAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSE 166 

EGSGVIY KS G ^||^pl|^++G+ + V+LS G+KV A VG D T|0|l I + 
Sbjct: 133 EGSGVIYMKSNG^m^^^VSGSDEIQVILSNGKKVTAKKVGTDSET^^LTIDGK 192 

Query: 167 HVKDVATFADSSKLTIGEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTN-I 225 

+V A F S L G+ IAVGSPLGS++A + T+GI+SA +R V +T GQ TN 
Sbjct: 193 YVTQTAQFGSSKNLEPGQQVIAVGSPLGSEYATSVTQGIISAKNRTVDVTNSAGQVTNQA 252 

Query: 226 NAIQTDAAINPGNSGGALINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIIN 285 

IQTDAAINP|Ni^|f^|^||^ K++++ DG T+ VEG+GFAI PS ++ W+ 1 IN 
Sbjct: 253 TVIQTDAAINP^^Mm^^^^SMKLSSSSDG-TAVEGMGFAIPSDEWSIIN 311 

Query: 286 KLEADGKISRPALGIRMVDLSQLSTNDSSQLKLPXXXXXXXXXXXXXXXLPAASAGLKAG 345 

+L +GKI+RP LG+R+V + +L+ +L LP A AG+K+ 

Sbjct: 312 QLVKNGKITRPKLGVRVVSVDELTEYGRKKLGLPDSVKSGVYVASVTKNGSADKAGIKSH 371 

Query: 346 DVI 348 
DVI 

Sbjct: 372 DVI 374 

> ref|ZP_00070364.1| COG0265: Trypsin-like serine proteases, typically periplasmic, 
contain C-terminal PDZ domain [Oenococcus oeni MCW] 
Length = 301 

Score = 181 bits (460), Expect = 2e-46 

Identities = 100/285 (35%), Positives = 148/285 (51%), Gaps = 12/285 (4%) 
Query: 10 LLTGVVGGAIALGGSAIYQXXXXXXXXXXXXXXXXXXXXXX--XXXXXXXXXXAIKKXXX 67 

LL+ + +G + LG ++ A K 

Sbjct: 15 LLSAI IGATWLGCFYLFYLAPAQNKAAKSSS IAAGM T K.V vvih i\j 1 booyAiAH ii\ j\v iun 74 

Query: 68 XXXXXMNYQKDNSQXXXXX-----XXXXXXXXXXXXXXXXXXXXEGSGVIYKKSGGDAYV 122 

Mvnv . EG+G+IY+ G +Y+ 

Sbjct: 75 AVvTVENYQKPSTEASDYF rEWrbbyobwoooooooo liiJyijilftrj^ioijJ. iniuvjrj io i ± 134 

Query: 123 VTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFADSSKLTI 182 

VTN HVI G + ++++++ G KVKA ++G + D+AVL+ISS V TF + SSK+ 

Sbjct: 135 VTNNHVIKGANEIEIIMANGTKVKAKLIGKNATKDIAVLRISSASVTTTGTFVNSSKVQA 194 

Query: 183 GEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAINPGNSGGA 242 

G+ +A+GSPLGS +A++ T GI+SAT+RQ+ + ++AIQTD A+NPGNSGG 

Sbjct: 195 GQQVLAIGSPLGSDYASSLTSGIVSATNRQI-----DDSPIKLSAIQTDVALNPGNSGGP 249 

Query: 243 LINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKL 287 

LIN+ G+VIGI KI++TEDGS VEG+ F+IPSN VV I + 

Sbjct: 250 LINMAGEVIGINSMKISSTEDGSEDVEGMSFSIPSNTVVATIKSI 294 


> ref|ZP_00064063.1| COG0265: Trypsin-like serine proteases, typically periplasmic, 
contain C-terminal PDZ domain [Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293] 
Length = 253 

Score = 180 bits (457), Expect = 4e-46 

Identities = 93/182 (51%), Positives = 129/182 (70%), Gaps = 3/182 (1%) 

Query: 107 EGSGVIYKKSGGDAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSE 166 
EGSGV+YK SGG ^?|^^+A + L ++ + G+K++A++VG D gSf|LK + 
Sbjct: 74 EGSGVVYKISGGY^^^^VADSDELQLITASGKKIQATIVGTDSSKf^|LKAKTT 133 

Query: 167 HVKDVATFADSSKLTIGEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNIN 226 
+K A+F ++ KL G+ +A+GSPLGS +A + T GI+SA R TL+ E ++ 
Sbjct: 134 DIKTSASFGNAKKLQSGQQVLAIGSPLGSDYATSLTSGIVSAPRR--TLSAEETGSSATT 191 

Query: 227 AIQTDAAINPGNSGGALINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINK 286 
AIQTDAAINP^G^WiW SKI ++ DG TSVEG+GFAIP++ V I 
Sbjct: 192 AIQTDAAINP|i|^^^^^^SSKIASSTDG-TSVEGMGFAIPADIVQTFIKN 250 

Query: 287 LE 288 
E 
Sbjct: 251 TE 252 

> ref|ZP_00070156.1| COG0750: Predicted membrane-associated Zn-dependent proteases 1 
[Oenococcus oeni MCW] 
Length = 421 

Score = 45.1 bits (105), Expect = 2e-05 

Identities = 26/56 (46%), Positives = 35/56 (62%), Gaps = 3/56 (5%) 

Query: 336 PAASAGLKAGDVITKVGDTAVTSSTDLQSALYSHNIND-TVKVTYYRDGKSNTADV 390 

PA GLK GDVITKV + +++ T L +A+ N+ D T+KV+Y R KS T V 
Sbjct: 218 PAMKQGLKKGDVITKVDSSKISNWTQLTTAI--ENVGDKTMKVSYRRGNKSRTVTV 271 

> ref|NP_266705.1| UDP-N-acetylglucosamine 1-carboxyvinyltransferase [Lactococcus lactis subsp. lactis] 
Length = 427 

Score = 34.7 bits (78), Expect = 0.034 

Identities = 40/153 (26%), Positives = 71/153 (46%), Gaps = 14/153 (9%) 

Query: 148 SWGYDEYTDLAVLKISSEHVKDVATFADSSKLTIGEPAIAVGSPLGSQF- -ANTATEGI 205 

+ + +D+ + K +SE +K A + SK+ +1 V P+ ++ A + G 

Sbjct: 65 TAI S FDQEAKKI I AKSNSE - 1 KTTAP YE YVS KM RAS I WMGPI LARNGQARVS MPGG 120 

Query: 206 LSATSRQVTLT QENGQTTNINAIQTDAAINPGNSGGALINIEGQVIGITQSKI- -T 259 

s SR + L ++ G T NA +A + GA I ++ +G TQ+ I 

Sbjct: 121 CSIGSRPIDLHLRGFEQMGATITQNAGYIEAKAD--KLKGAHIYLDFPSVGATQNLILAA 178 

Query: 260 TTEDGSTSVEGLGFAI PSNDWNI INKLEADGK 292 

T DG+T++E D+ N++NK+ A+ K 

Sbjct: 179 TLADGTTTLENAAREPE IVDLANLLNKMGANVK 211 

> ref|ZP_00046513.1| COG2996: Uncharacterized protein conserved in bacteria 
[Lactobacillus gasseri] 
Length = 297 

Score = 33.1 bits (74), Expect = 0.097 

Identities = 22/68 (32%), Positives = 37/68 (54%), Gaps = 4/68 (5%) 

Query: 109 SGVIYKKSGGDAYVVTNYHVIA--GNSSLDVLLSGGQKVKASVVGYDEY--TDLAVLKIS 164 

SG +Y+ ++V+T+ + +A S + L GQK+KA V+G +Y +L+VL 

Sbjct: 157 SGTVYRNYEVGSFVITDQYYLAFVHKSEMFRPLRLGQKIKARVIGVSQYGRLNLSVLPRG 216 

Query: 165 SEHVKDVA 172 

E + D A 
Sbjct: 217 FEEIDDDA 224 

> ref|ZP_00063264.1| COG0750: Predicted membrane-associated Zn-dependent proteases 1 
[Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293] 
Length = 417 

Score = 32.7 bits (73), Expect = 0.13 

Identities = 18/56 (32%), Positives = 30/56 (53%), Gaps = 1/56 (1%) 

Query: 335 LPAASAGLKAGDVITKVGDTAVTSSTDLQSALYSHNINDTVKVTYYRDGKSNTADV 390 

+PA AGLKAGD IT++ D T++ D + ++ + +T R+G +V 
Sbjct: 212 MPADQAGLKAGDEITQI-DRVKTTTWDQVANAIGNSKESQLNITVLRNGHKKQVEV 266 



> ref|NP_785411.1| carboxy-terminal processing proteinase [Lactobacillus plantarum WCFS1] 
Length = 492 

Score = 32.7 bits (73), Expect = 0.13 

Identities = 20/56 (35%), Positives = 29/56 (51%), Gaps = 1/56 (1%) 

Query: 336 PAASAGLKAGDVITKVGDTAVTSSTDLQS-ALYSHNINDTVKVTYYRDGKSNTADV 390 

PA AGLK D+I V +V T Q+ ++ I TVK+T R G++ T + 
Sbjct: 147 PAKKAGLKPKDIIKAVNGKSVAGKTLTQAVSMMRGKIGTTVKLTIERSGQTFTVSL 202 






> ref|NP_786668.1| extracellular protein [Lactobacillus plantarum WCFS1] 
Length = 190 

Score = 31.6 bits (70), Expect = 0.28 

Identities = 32/141 (22%), Positives = 55/141 (39%), Gaps = 32/141 (22%) 

Query: 187 IAVGSPLGSQFANTAT EGILSATSRQVTLTQENGQTTNINAIQTDA 232 

+ G PL Q A+T T EI T++ +TL Q G + I D+ 

Sbjct: 14 LMAGLPLVGQAADTETTTKAEVELIQDDTNKDITLDQAPGVSFGTEKITNDSKTYDAKNV 73 

Query: 233 AINPGNSGGALINIEGQVI GITQSKITTTEDGSTSV-EGLGFA 274 

NPGN+ G L+ ++G +T +++ T D + ++ + + 

Sbjct: 74 TGDLKVTNPGNTDGWLVQVKGSKFMNADDTRELRGAALTFAQVNATADDANNISKAKAYK 133 

Query: 275 I PSNDWNI INKLEADGKI SR 295 

+ D II EA+ I + 
Sbjct: 134 VDITDQNQI IMDAEANEGIGK 154 

> ref|NP_268285.1| hypothetical protein [Lactococcus lactis subsp. lactis] 
Length = 428 

Score = 30.8 bits (68), Expect = 0.48 

Identities = 20/55 (36%), Positives = 28/55 (50%), Gaps = 1/55 (1%) 

Query: 336 PAASAGLKAGDVITKVGDTAVTSSTDLQSALYSHNINDTVKVTYYRDGKSNTADV 390 

PA +AGLKAGD IVT ++ + + S+ +K+ R GKS T V 

Sbjct: 225 PAYNAGLKAGDKIEAVNGTKTADWNNWTEI - SGSKGKELKLEVSRSGKSETLSV 278 

> ref|NP_267651.1| sugar ABC transporter substrate binding protein [Lactococcus lactis subsp. lactis] 
Length = 483 

Score = 29.6 bits (65), Expect = 1.1 

Identities = 20/77 (25%), Positives = 35/77 (45%), Gaps = 1/77 (1%) 

Query: 125 NYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFADSSKLTIGE 184 

NY + N++ + G K+ S +G+ +Y + +SS D+A FA + 

Sbjct: 49 NYKELMANANKILEKKAGVKLDISYIGWGDYAQKMNVIVSSGEAYDIA-FAQDYATNAAK 107 

Query: 185 PAIAVGSPLGSQFANTA 201 

A A + L ++A TA 
Sbjct: 108 GAFADLTDLAPKYAKTA 124 

> ref|NP_687067.1| peptidase, M23/M37 family [Streptococcus agalactiae 2603V/R] 
ref|NP_734500.1| Unknown [Streptococcus agalactiae NEM316] 
Length = 299 

Score = 29.6 bits (65), Expect = 1.1 

Identities = 27/134 (20%), Positives = 50/134 (37%), Gaps = 7/134 (5%) 

Query: 239 SGGALINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKLEADGKISRPAL 298 

S G+ + + V I +ITT +G G+ +A+P+ ++ + ADG + 
Sbjct: 20 SAGSRVLADT YVRP IDNGR I TTGFNG YPGHCGVD YAVPTGT 1 1 RAV ADGTVKFAGA 75 

Query: 299 GIRMVDLSQLSTNDSSQLKLPXXXXXXXXXXXXXXXLPAASAGLKAGDVITKVGDTAVTS 358 
G ++ L+ N + + + +K GD+I VG T + + 
Sbjct: 76 GANFSWMTDLAGN CVMIQHADGMHSGYAHMSRVVARTGEKVKQGDIIGYVGATGMAT 132 

Query: 359 STDLQSALYSHNIN 372 

L N N 

Sbjct: 133 GPHLHFEFLPANPN 146 

> ref|NP_784951.1| cell surface SD repeat protein precursor [Lactobacillus plantarum WCFS1] 
Length = 3360 

Score = 29.3 bits (64), Expect = 1.4 

Identities = 42/145 (28%), Positives = 57/145 (39%), Gaps = 17/145 (11%) 

Query: 164 SSEHVKDVATFADSSKLTIGEPAIAVGSPLGS QFANTATEGILSATSRQVTLT 216 

S + V + SS LT+ A GS L AN T ++ +V + 

Sbjct: 1167 SYDAVDSAGLLSTSSSLTVTIKAGYTGSLLFQAVQGFSWDLANWFTVYTFASNLAEVDVY 1226 

Query: 217 QENGQTTNINAIQTDAAINPGN-SGGALINIEGQVIGITQ SKIT-TTEDGS-TSV 268 

N TNI+ D INP N S G+ + Q T KIT TT D S ++ 

Sbjct: 1227 SSNIPATNISIAGDDYVINPTNSSSGSNDKVTSQFTSTTNPENATGKITWTTSDSSIATI 1286 

Query: 269 EGLG-FAIPSNDWNIINKL-EADG 291 

+ G + SN V I + ADG 
Sbjct: 1287 DDSGLLTWSNGTVTITATITNADG 1311 



> ref|NP_786644.1| extracellular protein, gamma-D-glutamate-meso-diaminopimelate 
muropeptidase (putative) [Lactobacillus plantarum WCFS1] 
Length = 370 

Score = 29.3 bits (64), Expect = 1.4 

Identities = 29/111 (26%), Positives = 42/111 (37%), Gaps = 3/111 (2%) 

Query: 132 NSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFADSSKLTIGEPAIAVGS 191 

+SS+ S AS V T + SS V AT S+ + A + 

Sbjct: 144 SSSVAAQSSSTSTASASSVTSSASTSSVASQASSSAVTSSATSQSSASQSSASQASQSST 203 

Query: 192 PLGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAINPGNSGGA 242 

P+ S + TAT +ATS T+Q+ +N + TA SA 
Sbjct: 204 PVASSTSTTATSTQSAATS TSSQASSTASNTTSSSTTTATATAYSASA 251 



> ref|NP_687090.1| alcohol dehydrogenase, propanol-preferring [Streptococcus agalactiae 2603V/R] 
Length = 338 

Score = 28.9 bits (63), Expect = 1.8 

Identities = 19/69 (27%), Positives = 36/69 (52%), Gaps = 3/69 (4%) 

Query: 112 IYKKSGG-DAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKD 170 

I +K+GG VVT +A N ++D + +GG V + EY +L+++K + + + 
Sbjct: 224 IQEKTGGCHGVVVTAVSKVAFNQAIDSVRAGGTVVAVGLP--SEYMELSIVKTVLDGIRV 281 

Query: 171 VATFADSSK 179 

V + + K 
Sbjct: 282 VGSLVGTRK 290 



> ref|NP_267008.1| hypothetical protein [Lactococcus lactis subsp. lactis] 
Length = 1063 

Score = 28.9 bits (63), Expect = 1.8 

Identities = 53/272 (19%), Positives = 99/272 (36%), Gaps = 30/272 (11%) 

Query: 120 AYWTNYHVIAGNSSLDVLLSGGQKVKASWGYDEYTDLAVLKISSEHVKDVATFADSSK 179 

A ++ + + N D + Q++ G E T + L S+ DVA A ++ 

Sbjct: 34 AIIIVSGTITDQNVKADTAIDSSQQIS GITEVTSYSALASSTN- -SDVA- -ASQNQ 85 

Query: 180 LTIGEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAINPGNS 239 

+ + + + + T TEGI S S E+ TT+ IQT P N+ 

Sbjct: 86 VAYEQASDQSSNKSLANTVETDTEGITSNVSDSSNSINESQNTTSTWIQT PTNN 140 

Query: 240 GGALINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSND-VVNIINKLEADGKISRPAL 298 

++ + s ++ ++GS S+ ASDV + +G+S + 

Sbjct: 141 IVSLADSS-SSNDNGSNSILSSSNAADSVDSAVGSQSSTSSSGVLSESS- 188 

Query: 299 GIRMVDLSQLSTNDSSQLKLPXXXXXXXXXXXXXXXLPAASAGLKAGDVITKVGDTAVTS 358 

+D S + SS++ L + ++T+ A + 

Sbjct: 189 AIDSGIASVSQSSEMNLVGNSSASASSAAVASFTAILATNPSMVPMLTQALiAAAAPA 245 

Query: 359 STDLQSALYSHNINDTVKVTYYRDGKSNTADV 390 

+T SA+ + + D V G S A++ 

Sbjct: 246 TTS - GSAILNTTLGDLVNQAI STVGI SGLANI 276 

> ref|NP_734524.1| Unknown [Streptococcus agalactiae NEM316] 
Length = 338 

Score = 28.9 bits (63), Expect = 1.8 

Identities = 19/69 (27%), Positives = 36/69 (52%), Gaps = 3/69 (4%) 

Query: 112 IYKKSGG-DAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKD 170 

I +K+GG VVT +A N ++D + +GG V + EY +L+++K + ++ 
Sbjct: 224 IQEKTGGCHGVVVTAVSKVAFNQAIDSVRAGGTVVAVGLP--SEYMELSIVKTVLDGIRV 281 

Query: 171 VATFADSSK 179 

V + + K 
Sbjct: 282 VGSLVGTRK 290 

> ref|ZP_00064050.1| COG1364: N-acetylglutamate synthase (N-acetylornithine 
aminotransferase) [Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293] 
Length = 344 

Score = 28.5 bits (62), Expect = 2.4 

Identities = 38/195 (19%), Positives = 77/195 (39%), Gaps = 22/195 (11%) 

Query: 204 GILSATSRQVTLTQENGQTTNINAIQTDAAINPGNSGGALINI EGQVIGITQSK 257 

G+ + Q Q + +T +Q +N GN+ +1 Q Q 

Sbjct: 51 GVFTTNLVQAAPVQLDKKTIRNGQLQA- 1 IVNSGNANAVTGSIGVSHAESMQEFTAQQLN 109 

Query: 258 ITTTEDGSTSVEGLGFAIPSNDWNIINKLEADGKISRPALGIRMVDLSQLSTNDSSQLK 317 

I T+ G S +G +P + ++N I +L+ DG+AI D+S S++ 
Sbjct: 110 IDTSLVGVASTGIIGKVLPIDKIINGIKQLKIDGDTNGFAHAIMTTDTKEKSITIQSTIQ 169 

Query: 318 LPXXXXXXXXXXXXXXXLPAASAGLKAGDVITKVGDTAVTSSTDLQSALYSHNINDTVKV 377 

A +G+ ++ T +G +T+ ++ + L +++ V+ 
Sbjct: 170 GKIVTMSGV - AKGSGMLHPNMATMLG- - FITTDINIDAKLLQQALSEDVET 217 
Query: 378 TYYR---DGKSNTAD 389 

++ + DG ++T D 
Sbjct: 218 SFNQITIDGDTSTND 232 

> ref|ZP_00063238.1| COG1674: DNA segregation ATPase FtsK/SpoIIIE and related proteins 
[Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293] 
Length = 368 

Score = 28.1 bits (61), Expect = 3.1 

Identities = 29/108 (26%), Positives = 42/108 (38%), Gaps = 7/108 (6%) 

Query: 194 GSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAINPGNSGGALINIEGQVIGI 253 

G+ +NT 1+ Q + + T TD I NS A N + ++ 

Sbjct: 253 GAFISNTDVTNIVEFVKSQQEVQYSDAMTV TDEEIAQDNSENADGNSDDELFQE 306 

Query: 254 TQSKITTTEDGSTS VEGLGFAI PSNDWNI INKLEADGKISRPALGIR 301 

+ + STS+ FIN +1+ LEA G I PA G R 
Sbjct: 307 ALQFVIEQQKASTSLLQRRFRIGYNRAARLIDDLEAGGYIG-PADGSR 353 

> ref|NP_735868.1| Unknown [Streptococcus agalactiae NEM316] 
Length = 414 

Score = 28.1 bits (61), Expect = 3.1 

Identities = 35/165 (21%), Positives = 60/165 (36%), Gaps = 19/165 (11%) 

Query: 137 VLLSGGQKVKAS--VVGYDEYTDLAVLKISSEHVKDVATFADSSKLTIGEPAIAVGSPLG 194 

V + G K+ A +V YD T A ++ + VA ++ K T PA+ + 
Sbjct: 86 VTVKVGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMELSDQSS 145 

Query: 195 SQFANTATEGILSATSRQVTLTQE NGQTTNINAIQTDAAINPGNSGGALINIEG 248 

S T+ AT+R Q N Q ++N DA + AL 
Sbjct: 146 SSSQGQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKAL 200 

Query: 249 QVIGI TQSKI TTTEDGSTS VEGLGFAI PSNDWNI INKLEADGKI 293 

+ T++ T VE P++ ++ + +GK+ 

Sbjct: 201 NDTVITSDVSGTWEVNSDIDPASKTSQVLVHVATEGKL 239 

> ref|NP_784552.1| acetyltransferase (putative) [Lactobacillus plantarum WCFS1] 
Length = 171 

Score = 28.1 bits (61), Expect = 3.1 

Identities = 28/72 (38%), Positives = 40/72 (55%), Gaps = 13/72 (18%) 

Query: 165 SEHVKDV--ATFADSSKLTIGEPAIAVGSPLGSQFANTAT--EGI- -LSATSRQVTLTQE 218 

+E V DV A AD+++L +A+ + LG + +NT T EGI LS T Q + + 

Sbjct: 2 AE E WDVRPAE VADAAQL LALLAQLGRE-SNTFTVDEGIEDLSETDEQAQIERI 54 

Query: 219 NGQTTNINAIQT 230 

NG TTNI + T 
Sbjct: 55 NGTTTNI I FVAT 66 

> ref|ZP_00070200.1| COG1477: Membrane-associated lipoprotein involved in thiamine 
biosynthesis [Oenococcus oeni MCW] 
Length = 358 

Score = 28.1 bits (61), Expect = 3.1 

Identities = 12/45 (26%), Positives = 24/45 (53%) 

Query: 117 GGDAYWTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVL 161 

GG+ YV+ H +G +V + + + S VGY +D++++ 
Sbjct: 210 GGNIYVIGKSHPTSGTRDWNVGIQNPNQSRGSSVGYVRESDMSIV 254 

> ref|ZP_00063415.1| COG1668: ABC-type Na+ efflux pump, permease component 
[Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293] 
Length = 438 

Score = 27.7 bits (60), Expect = 4.1 

Identities = 27/111 (24%), Positives = 47/111 (42%), Gaps = 8/111 (7%) 

Query: 131 GNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFAD--SSKLTIGE-PAI 187 

GN++ ++ + G Q+V++ +V ++ D+ V I++E + A + LT+ + A 

Sbjct: 80 GNTTPNIAVVGNQEVRSILVQSEKELDIHVSNITNEKKANTALQNEKLDGVLTVNKNEAT 139 

Query: 188 AVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAINPGN 238 

p Q IL SR TQ + A QT + P N 

Sbjct: 140 ITTQPKSEQIPKEKITAILGNLSRSQKATQ YGLTAEQTADLVQPYN 185 

> ref|NP_785643.1| endopeptidase La (putative) [Lactobacillus plantarum WCFS1] 
Length = 348 

Score = 27.7 bits (60), Expect = 4.1 

Identities = 14/42 (33%), Positives = 20/42 (47%) 

Query: 342 LKAGDVITKVGDTAVTSSTDLQSALYSHNINDTVKVTYYRDG 383 

LK GD ITKV +++ Q + + V +TY R G 

Sbjct: 149 LKVGDTITKVDGHHFNTASAYQHYIGKQGVGHRVTITYRRKG 190 

> ref|ZP_00069981.1| COG3051: Citrate lyase, alpha subunit [Oenococcus oeni MCW] 
Length = 449 

Score = 27.7 bits (60), Expect = 4.1 

Identities = 24/112 (21%), Positives = 45/112 (40%), Gaps = 4/112 (3%) 

Query: 193 LGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAINPGNSGGALINIEGQVIG 252 

LG + A + + + V ++G TNI + ++ S G L N V 

Sbjct: 26 LGI KDLTLAPS S LTNVMNDMWKA I KSGT I TN I TS SGMRGS LGDAVSHGLLKN - - P WFR 83 

Query: 253 ITQSKITTTEDGSTSVEGLGFAIPSNDWNIINKLEADGKISRPALGIRMVD 304 

++ E+G ++ +P++D V N +E D +LG ++D 

Sbjct: 84 SHGNRARAIEEGKIKIDVAFLGVPNSDEVGNANGMEGDAAFG- -SLGYALMD 133 

> ref|ZP_00064376.1| COG1364: N-acetylglutamate synthase (N-acetylornithine 
aminotransferase) [Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293] 
Length = 346 

Score = 27.7 bits (60), Expect = 4.1 

Identities = 28/138 (20%), Positives = 59/138 (42%), Gaps = 15/138 (10%) 

Query: 255 QSKITTTEDGSTSVEGLGFAIPSNDVVNIINKLEADGKISRPALGIRMVDLSQLSTNDSS 314 

Q I T+ G S +G +P + ++N I +L+ DG+AI D+S S 
Sbjct: 49 QLNIDTSLVGVASTGIIGKVLPIDKIINGIKQLKIDGDTNGFAHAIMTTDTKEKSITIQS 108 

Query: 315 QLKLPXXXXXXXXXXXXXXXLPAASAGLKAGDVITKVGDTAVTSSTDLQSALYSHNINDT 374 

++ A +G+ ++ T +G +T+ ++ + L +++ 

Sbjct: 109 TIQGKIVTMSGV AKGSGMLHPNMATMLG- - FITTDINIDAKLLQQALSED 156 

Query: 375 VKVTYYR---DGKSNTAD 389 

V+ ++ + DG ++T D 
Sbjct: 157 VETSFNQITIDGDTSTND 174 

> ref|NP_688903.1| membrane-associated zinc metalloprotease, putative [Streptococcus agalactiae 2603V/R] 
ref|NP_736335.1| Unknown [Streptococcus agalactiae NEM316] 
Length = 419 

Score = 27.7 bits (60), Expect = 4.1 

Identities = 14/31 (45%), Positives = 20/31 (64%) 

Query: 336 PAASAGLKAGDVITKVGDTAVTSSTDLQSAL 366 

PAASAGLK D I ++G V++ L +A+ 
Sbjct: 212 PAASAGLKNNDRILQIGSHKVSNWEQLTAAV 242 

> ref|NP_268318.1| hypothetical protein [Lactococcus lactis subsp. lactis] 
Length = 342 

Score = 27.3 bits (59), Expect = 5.3 

Identities = 13/40 (32%), Positives = 20/40 (50%) 

Query: 342 LKAGDVITKVGDTAVTSSTDLQSALYSHNINDTVKVTYYR 381 

L+ D IT V TSS D+ + + + D+V + Y R 

Sbjct: 151 LELADTITAVNGQQFTSSADMIAYVSKQKVGDSVTIEYTR 190 

> ref|ZP_00046283.1| COG0507: ATP-dependent exoDNAse (exonuclease V), alpha subunit - 
helicase superfamily I member [Lactobacillus gasseri] 
Length = 792 

Score = 27.3 bits (59), Expect = 5.3 

Identities = 24/96 (25%), Positives = 43/96 (44%), Gaps = 11/96 (11%) 

Query: 233 AINPGNSGGALIN IEGQVIGITQSKITTT EDGSTSVEGLGFAI PSNDW 281 

A N G + G +N + G ++ I QS ++T +D T LA +D+ 

Sbjct: 201 ADNIGQALGIELNDPKRVRGAILSILQSALSTLGDTYVALDDLLTQAYDLVQASSYDDLA 260 

Query: 282 NI INKLEADGKI SRPALGIRMVDLSQLSTNDSSQLK 317 

N +N+L+ GK+ + + Q + S++LK 

Sbjct: 261 NSVNELQRQGKWVSGDKAALQGI FQTELDI SNELK 296 

> ref|ZP_00069420.1| COG3480: Predicted secreted protein containing a PDZ domain 
[Oenococcus oeni MCW] 
Length = 364 

Score = 27.3 bits (59), Expect = 5.3 

Identities = 14/44 (31%), Positives = 22/44 (50%) 

Query: 342 LKAGDVITKVGDTAVTSSTDLQSALYSHNINDTVKVTYYRDGKS 385 

+K GD ITKV +S Q L + + + V +T R+ K+ 

Sbjct: 153 IKVGDTITKVDGKHFNNSAGYQKYLAAMPVGEKVTLTVLRNNKT 196 

> ref|ZP_00063200.1| COG0827: Adenine-specific DNA methylase [Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293] 
Length = 329 

Score = 26.9 bits (58), Expect = 7.0 

Identities = 20/66 (30%), Positives = 30/66 (45%), Gaps = 1/66 (1%) 

Query: 251 IGITQSKITTTEDGSTSVEGLGFAIPSNDVVNII-NKLEADGKISRPALGIRMVDLSQLS 309 

I + I ED ++ F PSNDW II + ++ D + PA + + L+ L 

Sbjct: 26 I S YI DAL I E I LED INSQTVHRE FDKPSND WQ 1 1 QST I DMDWS LLS PAE KRKALQLAVLK 85 

Query: 310 TNDSSQ 315 

N Q 
Sbjct: 86 ANREDQ 91 

> ref|ZP_00062802.1| hypothetical protein [Leuconostoc mesenteroides subsp. 
mesenteroides ATCC 8293] 
Length = 179 

Score = 26.9 bits (58), Expect = 7.0 

Identities = 28/135 (20%), Positives = 52/135 (38%), Gaps = 5/135 (3%) 

Query: 128 VIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFADSSKLTIGEPAI 187 

+ L LSGQ + A++ + T + V ++ + K + + K G + 

Sbjct: 7 LFGSEKKLSQLKSTGQ-INATRLARNNDTPVLVAPVTGDLQKITDSRDEPFKTKNGVMLV 65 

Query: 188 AVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAINPGNSGGALINIE 247 

L + + TE +T+ +TLT + QT + + T + G 
Sbjct: 66 PHSGNLMAPVSGIVTE STNDYLTLTDI SEQTVTVTWGTSNWRLAQYGVGQQLHA 121 

Query: 248 GQVIGITQSKITTTE 262 

G VIG T K+ + + 
Sbjct: 122 GDVI GTTNQKVLSAD 136 

> ref|NP 687818.1| major facilitator family protein [Streptococcus agalactiae 
2603V/R] 

Length =383 

Score =26.9 bits (58), Expect =7.0 

Identities = 13/26 (50%), Positives = 17/26 (65%) 

Query: 2 AKANIGKLLLTGVVGGAIALGGSAIY 27

A NIGK L T +VG +A+G + IY
Sbjct: 141 ASLNIGKALTTFIVGLVLAIGVNYIY 166

> ref|NP_688359.1| conserved hypothetical protein [Streptococcus agalactiae
2603V/R] 

Length =414 
Score = 26.9 bits (58), Expect = 7.0 

Identities = 35/165 (21%), Positives = 59/165 (35%), Gaps = 19/165 (11%)

Query: 137 VLLSGGQKVKAS--VVGYDEYTDLAVLKISSEHVKDVATFADSSKLTIGEPAIAVGSPLG 194

V + G K+ A +V YD T A ++ + VA ++ K T PA+ 
Sbjct: 86 VTVKVGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMESSDQSS 145 

Query: 195 SQFANTATEGILSATSRQVTLTQE NGQTTNINAIQTDAAINPGNSGGALINIEG 248

S T+ AT+R Q N Q ++N DA + AL 
Sbjct: 146 SSSQGQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKAL 200 

Query: 249 QVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKLEADGKI 293

+ T++ T VE P++ ++ + +GK+ 

Sbjct: 201 NDTVI TS D VS GT WE VNS DID PAS KTS QVL VHVATEGKL 239 

> ref|NP_688028.1| sensor histidine kinase, putative [Streptococcus agalactiae
2603V/R]
ref|NP_735501.1| Unknown [Streptococcus agalactiae NEM316]
Length = 579 

Score =26.9 bits (58), Expect =7.0 

Identities = 12/47 (25%), Positives = 28/47 (59%), Gaps = 1/47 (2%) 

Query: 147 ASVVGYDEYTDLAVLKISSEHVKDVATFADSSKLTIGEPAIAVGSPL 193

+ + G + +DL+++ + H+ D ++ A++ LTIG + +G P+ 
Sbjct: 54 SNFTGVEIQSDLSIIPQTLNHIADQSSVANTRVLTIGVSGL-IGGPI 99 

> ref|NP_735272.1| Unknown [Streptococcus agalactiae NEM316]
Length = 383

Score = 26.9 bits (58), Expect = 7.0

Identities = 13/26 (50%), Positives = 17/26 (65%)

Query: 2 AKANIGKLLLTGVVGGAIALGGSAIY 27

A NIGK L T +VG +A+G + IY 
Sbjct: 141 ASLNIGKALTTFIVGLVLAIGVNYIY 166 

> ref|NP_786635.1| extracellular protein [Lactobacillus plantarum WCFS1]
Length =322 

Score =26.9 bits (58), Expect =7.0 

Identities = 38/158 (24%), Positives = 65/158 (41%), Gaps = 14/158 (8%) 

Query: 140 SGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFA---DSSKLTIGEPAIAVGSPLGSQ 196

S G + ++ Y+ ++ I + + K+ A A D++ I L

Sbjct: 104 SSGSGINVKILNYNGSNNITT--ITANQYKNAALTAGITDANIYVTSATPIDGSGALAGV 161

Query: 197 FANT ATEG I L S ATS RQ VTLTQENGQT TNINAIQ TDAAINPGNSGGALINIEGQ 249 

+A A G S ++QVT Q+ T T N + TD+ +N +G A + + 
Sbjct: 162 YAAYAKSGN-SLNTKQVTAAQDELSTLSGITQANKSKDGYTDSQLNNAVAG-AKKEMAQK 219 

Query: 250 VIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKL 287

IT+++ITT + + L I +N UN L 
Sbjct: 220 GSNITKNEITTIVNQQITNNNLTNVITNNQKTEIINLL 257 

> ref|ZP_00046678.1| COG4653: Predicted phage phi-C31 gp36 major capsid-like protein
[Lactobacillus gasseri]
Length = 392 






ANNEX 4 



Score = 26.9 bits (58), Expect = 7.0 

Identities = 16/64 (25%), Positives = 28/64 (43%), Gaps = 11/64 (17%) 

Query: 140 SGGQKVKASVVGYDEYTDL-----------AVLKISSEHVKDVATFADSSKLTIGEPAIA 188

+G KA + +D+ DL AV ++ + VK + D + I +P+ + 

Sbjct: 248 AGSTAAKADALTFDDLIDLFYSLKAPYRQNAVFLMNDDTVKAIRKMKDKNDQYIWQPSVQ 307 

Query: 189 VGSP 192 
VG P 

Sbjct: 308 VGQP 311 

> ref|ZP_00046947.1| COG2931: RTX toxins and related Ca2+-binding proteins
[Lactobacillus gasseri]
Length =1991 

Score =26.9 bits (58), Expect =7.0 

Identities = 30/113 (26%), Positives = 43/113 (38%), Gaps = 7/113 (6%) 

Query: 114 KKSGGDAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVAT 173

K + G+ +T V AG L + S ++VKA YD + + V +AT 

Sbjct: 1066 KDADGNYVAMTGNPVNAGTYYLHLTKSAIEQVKADNSNYDFTSVNGEFTYTINAVNGIAT 1125 

Query: 174 FADSSKLTIGEPAIA VGSPLGSQFANTATEGILSATSRQVTLTQENGQTT 223 

+ SST A+ VSG N G +SQT +GT 
Sbjct: 1126 LSGS SS KT YDGQA VTTAE VNS TNGD 1 1 VNFTFPG SSAQSTYVLQTGDYT 1174 



> ref|ZP_00046780.1| COG3210: Large exoproteins involved in heme utilization or adhesion
[Lactobacillus gasseri]
Length = 3692 

Score =26.9 bits (58), Expect =7.0 

Identities = 28/112 (25%), Positives = 45/112 (40%), Gaps = 7/112 (6%) 

Query: 204 GILSATSRQVTLTQENGQTTNINAIQTDAAINPGNSGGALINIEGQVIGITQSKITTTED 263 

G L +T +NG+ T + Q A+ + GI+ IQST + D 

Sbjct: 2225 GNLVTVDEDGNITSQNGKITWNHESQEFEAVPAIDHDGYYIS SINQSNSTASVD 2278 

Query: 264 GSTSVEGLGFAIPSNDVVNIINKLEADGKISRPALG-IRMVDLSQLSTNDSS 314

G T G P++ NI+ L + + AGI+D+ T +S +

Sbjct: 2279 GQTGAVGTETVTPNSQNGNIVITLTRNPDVPVAAQGSINYIDDTTGQTIESA 2330

> ref|ZP_00062638.1| COG4932: Predicted outer membrane protein [Leuconostoc
mesenteroides subsp. mesenteroides ATCC 8293] 
Length =508 

Score = 26.6 bits (57), Expect =9.1 

Identities = 33/138 (23%), Positives = 50/138 (36%), Gaps = 29/138 (21%) 

Query: 115 KSGGDAYWTNYHVI AGNS SLD VLLSGGQKVKASWGYDEYTDLA 159 

KSG D+ + ++HVND VLGKV + G L 

Sbjct: 171 KSGSDSEINSDVHVYPKNEQTDAITKDLSDESKKDLIVTLPDGSKVYNATYGQKFGYQLQ 230 

Query: 160 VLKISSEHVKDVATFADSSKLTIGEPAIAV GSPLGSQFANTATEGILSATSRQVTLT 216 

+ + KD D+LI+AV G G++ +AT+ T 
Sbjct: 231 IAVPWNIADKDTFNWDTPNLGIDDDATTVKVAGLTKGTDYTVSATDA T 279

Query: 217 QENGQTTNINAIQTDAAI 234

+NG++ I T AA+
Sbjct: 280 DKNGKSFKITFNPTAAAV 297



> ref|NP_687888.1| exonuclease RexA [Streptococcus agalactiae 2603V/R]
Length = 1207 

Score =26.6 bits (57), Expect =9.1 

Identities = 21/85 (24%), Positives = 33/85 (38%), Gaps = 8/85 (9%) 

Query: 292 KISRPALGIRMVDLSQLSTNDSSQLKLPXXXXXXXXXXXXXXXLPAASAGLKAGDVITKV 351

KI P L I VD+ + T S KLP A+ G +++ ++

Sbjct: 1010 KIYEPILDIEGVDVMETITKTSVDFKLPDFSTSKKQ--------DPAALGSAVHELMQRI 1061

Query: 352 GDTAVTSSTDLQSALYSHNINDTVK 376 

++ D+Q AL N +VK 

Sbjct: 1062 EMSSHVKMEDIQKALTEVNAETSVK 1086 



> ref|NP_687383.1| 3-oxoacyl-(acyl-carrier-protein) synthase II [Streptococcus
agalactiae 2603V/R]
ref|NP_734805.1| Unknown [Streptococcus agalactiae NEM316]
Length = 410 

Score =26.6 bits (57), Expect =9.1 

Identities = 30/148 (20%), Positives = 58/148 (39%), Gaps = 5/148 (3%) 

Query: 117 GGDAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFAD 176

GG +T + IAG SL L + +AS+ + + + S V + A+ 

Sbjct: 189 GGAEAAITKF-AIAGFQSLTALSTTEDPSRASIPFDKDRNGFIMGEGSGMLVLESLEHAE 247 

Query: 177 SSKLTIGEPAIAVGSPLGS-QFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAIN 235 

TI + G+ + + EG+ + +Q+L+N+ +N+ 
Sbjct: 248 KRGATILAEWGYGNTCDAYHMTSPHPEGLGATKAIQLALVEANIKPEEVNYVNAHGTST 307

Query: 236 PGNSGG---ALINIEGQVIGITQSKITT 260

PNG A++ G + ++ +K T
Sbjct: 308 PANEKGESQAIVAALGTDVPVSSTKSFT 335



Database: Unfinished Lactobacillus gasseri; Completed Lactobacillus 
plantarum WCFS1; Completed Lactococcus lactis subsp. lactis; 
Unfinished Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293;
Unfinished Oenococcus oeni MCW; Completed Streptococcus agalactiae 
2603V/R; Completed Streptococcus agalactiae NEM316 

Posted date: Oct 29, 2003 1:28 AM 
Number of letters in database: 4,501,851 
Number of sequences in database: 15,229 

Lambda K H 

0.308 0.128 0.338

Gapped 

Lambda K H 

0.267 0.0410 0.140 
Matrix: BLOSUM62 

Gap Penalties: Existence: 11, Extension: 1 
Number of Hits to DB: 404,382
Number of Sequences: 15229
Number of extensions: 14371
Number of successful extensions: 43

Number of sequences better than 10.0: 10 

Number of HSP's better than 10.0 without gapping: 4 

Number of HSP's successfully gapped in prelim test: 6 

Number of HSP's that attempted gapping in prelim test 

Number of HSP's gapped (non-prelim): 11 

length of query: 408 

length of database: 4,217,779 

effective HSP length: 94 

effective length of query: 314 

effective length of database: 2,884,107 

effective search space: 905609598 

effective search space used: 905609598 

T: 11 

A: 40 

X1: 16 ( 7.1 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.7 bits)
S2: 57 (26.6 bits) 
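
The bit scores and Expect values in these reports can be related to the raw scores (the numbers in parentheses) through the standard Karlin-Altschul relations: bit score = (Lambda x S - ln K) / ln 2, and Expect = effective search space x 2^(-bit score). The Python sketch below is illustrative only and is not part of the BLAST output. The function names are hypothetical, and the constants are the gapped Lambda and K values and the lengths listed in the statistics above.

```python
import math

# Gapped Karlin-Altschul parameters, copied from the statistics listed above.
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410


def bit_score(raw_score, lam=LAMBDA_GAPPED, k=K_GAPPED):
    """Convert a raw alignment score S into a bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, hsp_len, effective_db_len):
    """Effective query length (query minus HSP length) times effective database length."""
    return (query_len - hsp_len) * effective_db_len


def expect_value(raw_score, search_space):
    """Expected number of chance alignments scoring at least raw_score."""
    return search_space * 2.0 ** (-bit_score(raw_score))


if __name__ == "__main__":
    # Query length 408, effective HSP length 94, effective database length 2,884,107.
    space = effective_search_space(408, 94, 2_884_107)
    print(space)
    print(round(bit_score(59), 1))
    print(round(expect_value(59, space), 1))
```

Under these assumptions the computed search space reproduces the listed value of 905609598, and a raw score of 59 comes out at approximately 27.3 bits with an Expect of approximately 5.3, in line with the corresponding entry reported in this annex.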
BLASTP 2.2.6 [Apr-09-2003]

RID: 1065207448-6660-583997.BLASTQ3



Query= 



(408 letters) 



Database: Completed Streptococcus mutans UA159 ; 



1,531,058 sequences; 495,743,110 total letters 



Taxonomy reports 



                                                                   Score     E
Sequences producing significant alignments:                       (bits)  Value

ref|NP_722446.1|  serine protease HtrA [Streptococcus mutans...      296   5e-82
ref|NP_722143.1|  putative transcriptional regulator [Strept...       28   0.36
ref|NP_721869.1|  putative UDP-N-acetylglucosamine 1-carboxy...       28   0.62
ref|NP_721706.1|  putative bacitracin synthetase [Streptococ...       27   0.81
ref|NP_720929.1|  putative polyribonucleotide nucleotidyltra...       27   1.1
ref|NP_721524.1|  putative ABC transporter, phosphate-bindin...       27   1.4
ref|NP_722399.1|  glucan-binding protein A, GbpA [Streptococ...       26   1.8
ref|NP_721986.1|  putative D-3-phosphoglycerate dehydrogenas...       26   2.4
ref|NP_722066.1|  putative 3-oxoacyl-(acyl-carrier-protein) ...       26   2.4
ref|NP_722435.1|  conserved hypothetical protein [Streptococ...       25   3.1
ref|NP_721103.1|  phosphoenolpyruvate:sugar phosphotransfera...       24   9.0
ref|NP_720786.1|  hypothetical protein [Streptococcus mutans...       24   9.0



Alignments 



> ref|NP_722446.1| serine protease HtrA [Streptococcus mutans UA159]
Length = 402

Score = 296 bits (759), Expect = 5e-82 

Identities = 153/286 (53%), Positives = 203/286 (70%), Gaps = 3/286 (1%) 

Query: 107 EGSGVIYKKSGGDAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSE 166
EGSGVIYKK G m4}/ITN§m^ L+++++ G+KV +VG D Y+D§AVi+KISS+
Sbjct: 107 166

Query: 167 HVKDVATFADSSKLTIGEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNIN 226
+V VA FA+S K+ +GEPAIA+GSPLGS +AN+ TEGI+S+ SR VT ENG+T + N
Sbjct: 167 YVTTVAEFANSDKIKVGEPAIAIGSPLGSDYANSVTEGIVSSLSRTVTSQNENGETISTN 226

Query: 227 AIQTDAAINPpp!^ 284
A T OTDAA TNPCT^(^^^iW^^E'GS^ SKI ++ + ++ + VEG+GFAI PSNDW+ 1 1
Sbjct: 227 a TOTn Aa TNPfSjtf^^ SNNSNSGVAVEGMGFAI PSNDWS 1 1 286

Query: 285 NKLEADGKISRPALGIRMVDLSQLSTNDSSQLKLPXXXXXXXXXXXXXXXLPAASAGLKA 344
N+LE +G++ RPALGI M +LS + ST+ LK+P +P A LK
Sbjct: 287 NQLEENGEVVRPALGISMANLSEASTSGRDTLKIPSDVTSGIVVLSTQSGMP-ADGKLKK 345

Query: 345 GDVITKVGDTAVTSSTDLQSALYSHNINDTVKVTYYRDGKSNTADV 390
DVIT++ V S +DLQS LY H D +K+T+YR+ T ++
Sbjct: 346 YDVITEIDGKKVASISDLQSILYKHKKGDKIKLTFYREKDKQTVEI 391





> ref|NP_722143.1| putative transcriptional regulator [Streptococcus mutans
UA159]

Length = 261 
Score =28.5 bits (62), Expect =0.36 

Identities = 34/128 (26%), Positives = 51/128 (39%), Gaps = 26/128 (20%) 

Query: 193 LGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAIN---PGNSGGALINIEGQ 249

+G Q N TE L T R+ T T + + ++ AAI GN G + +

Sbjct: 154 VGIQLLNLQTEN-LEETIRKQTAINMAINTLSYSEMKAVAAILNELDGNEGRLTASVIAD 212

Query: 250 VIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKLEADGKISRPALGIRMVDLSQLS 309

IGIT+S I VN + KLE+ G I +LG++ L ++

Sbjct: 213 RIGITRSVI----------------------VNALRKLESAGIIESRSLGMKGTYLKVIN 250

Query: 310 TNDSSQLK 317 
+LK 

Sbjct: 251 EGIFDKLK 258 



> ref|NP_721869.1| putative UDP-N-acetylglucosamine 1-carboxyvinyl transferase
[Streptococcus mutans UA159] 
Length = 423 

Score = 27.7 bits (60), Expect = 0.62 

Identities = 34/146 (23%), Positives = 64/146 (43%), Gaps = 10/146 (6%) 

Query: 150 VGYDEYTDLAVLKISSEHVKDVATFADSSKLTIGEPAIAVGSPLGSQ--FANTATEGILS 207

V +DE + ++ + + + DVA + S++ +1 V P+ ++ A + G + 

Sbjct: 66 VDFDEERNQILVDATGD-ILDVAPYEYVSQM RASIWLGPILARNGHAKVSMPGGCT 121 

Query: 208 ATSRQVTLTQENGQTTNINAIQT--DAAINPGNSGGALINIEGQVIGITQSKI--TTTED 263

SR + L + + QT D GA I ++ +G TQ+ + T D 

Sbjct: 122 IGSRPIDLHLKGLEAMGAKIQQTGGDITATADRLKGANIYMDFPSVGATQNLMMAATLAD 181 

Query: 264 GSTSVEGLGFAIPSNDVVNIINKLEA 289

G+T +E D+ N++NK+ A

Sbjct: 182 GTTIIENAAREPEIVDLANLLNKMGA 207

> ref|NP_721706.1| putative bacitracin synthetase [Streptococcus mutans UA159]
Length = 1455

Score = 27.3 bits (59), Expect = 0.81

Identities = 20/63 (31%), Positives = 29/63 (46%), Gaps = 3/63 (4%)

Query: 119 DAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYT-DLAVLKISSEHVKDVATFADS 177

D V Y V G +D ++S K+K + Y E T +L S EH D+ T +

Sbjct: 887 DQVKVNGYRVELGE--IDSIISKMSKIKRAKTIYQEETGNLIAFCESKEHCSDIETRKEL 944

Query: 178 SKL 180 
SK+ 

Sbjct: 945 SKI 947 



> ref|NP_720929.1| putative polyribonucleotide nucleotidyltransferase (general stress
protein 13) [Streptococcus mutans UA159]
Length = 130 

Score = 26.9 bits (58), Expect = 1.1 

Identities = 13/25 (52%), Positives = 16/25 (64%) 

Query: 139 LSGGQKVKASVVGYDEYTDLAVLKI 163

LS GQ+V VV YDEY+ A L +
Sbjct: 50 LSVGQEVLVQVVDYDEYSQKASLSL 74



> ref|NP_721524.1| putative ABC transporter, phosphate-binding protein [Streptococcus
mutans UA159]
Length = 287 

Score = 26.6 bits (57), Expect = 1.4 

Identities = 17/44 (38%), Positives = 21/44 (47%), Gaps = 1/44 (2%) 

Query: 335 LPAASAGLKAGDVITKVGDTAVTSSTDLQSALYSH-NINDTVKV 377

LAS + G IT VG TA+ + S + H NI TV V 
Sbjct: 19 LAACSNWIDKGQSITSVGSTALQPLVEASSDEFGHANIGKTVNV 62 



> ref|NP_722399.1| glucan-binding protein A, GbpA [Streptococcus mutans UA159]
Length = 565 

Score =26.2 bits (56), Expect =1.8 

Identities = 30/153 (19%), Positives = 54/153 (35%), Gaps = 12/153 (7%) 

Query: 172 ATFADSSKLTIGEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTD 231 

AT +SS+ +E A S N + +S+++ +G++ A+ 

Sbjct: 55 ATVQESSEQPVTEAPAA DS S VENNSANAVKS SETAEAAEVS DGGRASQTEAVTNQ 109

Query: 232 AAINPGNSGGALINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKLEADG 291

+ + G+ + + T+ A +ND + E DG 

Sbjct: 110 TNSEEHHPAEKATAVSGEAQSVQNAPSENAAQQETAKTEPATAAENNDAAPTNSFFEKDG 169 

Query: 292 K------ISRPALGIRMVDLSQLSTN-DSSQLK 317

K +AG++DQLND SQ+K

Sbjct: 170 KWYYKKADGQLATGWQTIDGKQLYFNQDGSQVK 202



> ref|NP_721986.1| putative D-3-phosphoglycerate dehydrogenase [Streptococcus mutans
UA159]
Length =393 

Score =25.8 bits (55), Expect =2.4 

Identities = 17/78 (21%), Positives = 32/78 (41%), Gaps = 3/78 (3%) 

Query: 138 LLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFADSSKLTIGEPAIAVGSPLGSQF 197

+ + +++ +V+GYD Y + S HVK V D + I + PL + 

Sbjct: 151 IANDARRLGMNVLGYDPWSIETAWNISSHVKRVNDLKD IFENSDYITIHVPLNDET 207 

Query: 198 ANTATEGILSATSRQVTL 215 

NT + ++ T+ 

Sbjct: 208 KNTFNADSFALMNKGTTI 225



> ref|NP_722066.1| putative 3-oxoacyl-(acyl-carrier-protein) synthase [Streptococcus
mutans UA159]
Length = 410 

Score = 25.8 bits (55), Expect =2.4 

Identities = 27/136 (19%), Positives = 48/136 (35%), Gaps = 13/136 (9%) 

Query: 136 DVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATF-ADSSKLTIGEPAIAVGSPLG 194

DV+L+GG + ++G + LL++ + FD+ +GE A 
Sbjct: 184 D V I LAGGS E AS I TKI G I GGFNALTALSTTEDPARS AI PFDKDRNGFVMGEGA 235 

Query: 195 SQFANTATEGILSATSRQVTLTQE-NGQTTNINAIQTDAAINPGNSGGALINIEGQVIGI 253 

A E+AR + EG+N+A G+ I + GI 

Sbjct: 236 AVLILESLEHAQKRGARILAEWGYGSNCDAYHMTTPTPDGSGAAKAIKLAINEAGI 292 

Query: 254 TQSKITTTEDGSTSVE 269

+ ++ TS + 

Sbjct: 293 SPEEVNYVNAHGTSTQ 308 



> ref|NP_722435.1| conserved hypothetical protein [Streptococcus mutans UA159]
Length = 325 

Score = 25.4 bits (54), Expect =3.1 

Identities = 18/61 (29%), Positives = 27/61 (44%), Gaps = 12/61 (19%) 

Query: 198 ANTATEG----ILSATSRQVTLTQENGQTTNINAIQTDAAINPGNSGGALINIEGQVIGI 253

A T+G L+AT + T+T G TT++ + G+ G I I GQ + 



ANNEX 5 

Sbjct: 258 ATNTTDGESGTTLTATDKTYTVTLAEGSTTSM--------LTVGSPSGVEITINGQKVDT 309

Query: 254 T 254 
T 

Sbjct: 310 T 310 



> ref|NP_721103.1| phosphoenolpyruvate:sugar phosphotransferase system enzyme I, PTS
system EI component [Streptococcus mutans UA159]
Length =577 

Score =23.9 bits (50), Expect =9.0 

Identities = 19/104 (18%), Positives = 39/104 (37%), Gaps = 3/104 (2%) 

Query: 210 SRQVTLTQENGQTTNINAIQTDAAINPGNSGGALINIEGQVIGITQSKITTTEDGST-SV 268 

+ +T +NG +N I INP A G+ +++ +D T + 

Sbjct: 207 TNDITERVKNGDIVAVNGITGQVIINPTEDQIAEFKAAGETYAKQKAEWALLKDAETVTA 266 

Query: 269 EGLGFAIPSNDVVNIINKLEADGKISRPALGIRMVDLSQLSTND 312

+G F + +N + +E A+G+ + + + D

Sbjct: 267 DGKHFELAAN--IGTPKDVEGVNNNGAEAVGLYRTEFLYMDSQD 308



> ref|NP_720786.1| hypothetical protein [Streptococcus mutans UA159]
Length =411 

Score = 23.9 bits (50), Expect = 9.0 

Identities = 12/40 (30%), Positives = 19/40 (47%) 

Query: 151 GYDEYTDLAVLKISSEHVKDVATFADSSKLTIGEPAIAVG 190 

GY +T + +S + K T AD LT+G+ + G 
Sbjct: 301 GYAYFTSKDIKTVSEKSYKSDWTQADVDALTVGDSSTGKG 340



Database: Completed Streptococcus mutans UA159 

Posted date: Oct 1, 2003 10:43 PM 
Number of letters in database: 579,702 
Number of sequences in database: 1960 

Lambda K H 

0.308 0.128 0.338 

Gapped 

Lambda K H 

0.267 0.0410 0.140 



Matrix: BLOSUM62 

Gap Penalties: Existence: 11, Extension: 1 
Number of Hits to DB: 35,974 
Number of Sequences: 1960 
Number of extensions: 1254
Number of successful extensions: 5

Number of sequences better than 10.0: 1 

Number of HSP's better than 10.0 without gapping: 0 

Number of HSP's successfully gapped in prelim test: 1 

Number of HSP's that attempted gapping in prelim test 

Number of HSP's gapped (non-prelim): 1 

length of query: 408 

length of database: 577,947 

effective HSP length: 81 

effective length of query: 327 

effective length of database: 419,835 

effective search space: 137286045 

effective search space used: 137286045 



T: 11
A: 40
X1: 16 ( 7.1 bits)
X2: 38
X3: 64
S1: 42

ANNEX 6

BLASTP 2.2.6 [Apr-09-2003] 

RID: 1065207742-9645-69613.BLASTQ3



Query= 



(408 letters) 



Database: Completed Streptococcus pneumoniae R6 ; 



1,531,058 sequences; 495,743,110 total letters 



Taxonomy reports 



                                                                   Score     E
Sequences producing significant alignments:                       (bits)  Value

ref|NP_359636.1|  Serine protease [Streptococcus pneumoniae R6]      325   2e-90
ref|NP_359374.1|  Conserved hypothetical protein [Streptococ...       40   2e-04
ref|NP_357681.1|  Hypothetical protein [Streptococcus pneumo...       29   0.22
ref|NP_357856.1|  Alcohol dehydrogenase, propanol-preferring...       28   0.48
ref|NP_357669.1|  Conserved hypothetical protein [Streptococ...       27   1.4
ref|NP_358044.1|  EcoA type I restriction-modification enzym...       25   3.1
ref|NP_359110.1|  Penicillin-binding protein 2B [Streptococc...       25   5.3
ref|NP_358390.1|  6-phosphofructokinase I [Streptococcus pne...       24   6.9
ref|NP_359031.1|  Isochorismatase [Streptococcus pneumoniae R6]       24   6.9
ref|NP_359032.1|  Transcriptional pleiotropic repressor [Str...       24   9.0

Alignments



> ref|NP_359636.1| Serine protease [Streptococcus pneumoniae R6]
Length = 397

Score = 325 bits (832), Expect = 2e-90

Identities = 167/284 (58%), Positives = 212/284 (74%), Gaps = 3/284 (1%) 

Query: 107 EGSGVIYKKSGGDAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSE 166

EGSGVIYKK+ +AY^^pNjipi G S +D+ LS G KV +VG D ++DJAV+KISSB
Sbjct: 106 EGSGVIYKKNDKEAYJ^^^§^ INGASKVDIRLSDGTKVPGEIVGADTFSD|ii||vKISSE 165

Query: 167 HVKDVATFADSSKLTIGEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNIN 226

V VA F DSSKLT+GE AIA+GSPLGS++ANT T+GI+S+ +R V+L E+GQ +
Sbjct: 166 KVTTVAEFGDSSKLTVGETAIAIGSPLGSEYANTVTQGIVSSLNRNVSLKSEDGQAISTK 225

Query: 227 AIQTDAAINPGNSGGALINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINK 286

AIQTD AINP^sS^^N^p#p| SKI T +G TS VEGLGFAI P+ND +NII +
Sbjct: 226 AIQTDTAINPjSj^^ 283

Query: 287 LEADGKISRPALGIRMVDLSQLSTNDSSQLKLPXXXXXXXXXXXXXXXLPAASAGLKAGD 346

LE +GK++RPALGI+MV+LS +ST+D +L +P +P A+ L+ D

Sbjct: 284 LEKNGKVTRPALGIQMVNLSNVSTSDIRRLNIPSNVTSGVIVRSVQSNMP-ANGHLEKYD 342

Query: 347 VITKVGDTAVTSSTDLQSALYSHNINDTVKVTYYRDGKSNTADV 390

VITKV D + SSTDLQSALY+H+I DT+K+TYYR+GK T + 
Sbjct: 343 VITKVDDKEIASSTDLQSALYNHSIGDTIKITYYRNGKEETTSI 386 



> ref|NP_359374.1| Conserved hypothetical protein [Streptococcus pneumoniae R6]
Length =345 

Score = 39.7 bits (91), Expect = 2e-04 
Identities = 19/48 (39%), Positives = 27/48 (56%) 

Query: 342 LKAGDVITKVGDTAVTSSTDLQSALYSHNINDTVKVTYYRDGKSNTAD 389 

L D +T V D SS DL + S + D+VKVTY DG++ +A+ 
Sbjct: 146 LNISDTVTAVNDQTFDSSKDLIDYVSSQKLGDSVKVTYEEDGQTKSAE 193



> ref|NP_357681.1| Hypothetical protein [Streptococcus pneumoniae R6]
Length = 320 

Score = 29.3 bits (64), Expect = 0.22 

Identities = 21/72 (29%), Positives = 33/72 (45%), Gaps = 5/72 (6%) 

Query: 189 VGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAINPGNSGGALINIEG 248

+ S S+F T + + ++ GQTT INA +A + N+ ++ IEG
Sbjct: 142 IASSYSSRFEEVILRLPKGRTLKGINISANRGQTTIINASLENATL---NTNSYILRIEG 198

Query: 249 QVIGITQSKITT 260

I SK+TT
Sbjct: 199 S--RIKNSKLTT 208



> ref|NP_357856.1| Alcohol dehydrogenase, propanol-preferring [Streptococcus
pneumoniae R6]
Length = 339 

Score = 28.1 bits (61), Expect = 0.48 

Identities = 18/73 (24%), Positives = 40/73 (54%), Gaps = 4/73 (5%) 

Query: 109 SGVIYKKSGGDAY--VVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSE 166

+G+I +K+ G A+ VVT +A N ++D + +GG+ V + E +L+++K +
Sbjct: 221 AGLIKEKTDGGAHSAVVTAVSKVAFNQAVDSIRAGGRVVAVGLP--SEMMELSIVKTVLD 278

Query: 167 HVKDVATFADSSK 179
++ + + + K
Sbjct: 279 GIQVIGSLVGTRK 291 



> ref|NP_357669.1| Conserved hypothetical protein [Streptococcus pneumoniae R6]
Length = 1161

Score = 26.6 bits (57), Expect = 1.4

Identities = 11/25 (44%), Positives = 16/25 (64%)

Query: 335 LPAASAGLKAGDVITKVGDTAVTSS 359 

LP ++G K GD+ K GDT +T + 
Sbjct: 53 LPEETSGTKEGDLSEKPGDTVLTQA 77 



> ref|NP_358044.1| EcoA type I restriction-modification enzyme R subunit
[Streptococcus pneumoniae R6] 
Length = 777 

Score = 25.4 bits (54), Expect = 3.1 

Identities = 18/66 (27%), Positives = 30/66 (45%), Gaps = 1/66 (1%) 

Query: 119 DAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHVKDVATFADSS 178

+ Y+VT+ V NS++ VL G+ + S+ Y L ++ + V AD 

Sbjct: 588 EKYIVTDKQVTILNSTVQVLDENGKLITESLTDYTRKNILGSYATLNDFI-TVWHTADKK 646 

Query: 179 KLTIGE 184 

KL + E
Sbjct: 647 KLILDE 652 



> ref|NP_359110.1| Penicillin-binding protein 2B [Streptococcus pneumoniae R6]
Length = 685 

Score = 24.6 bits (52), Expect = 5.3 

Identities = 14/47 (29%), Positives = 19/47 (40%) 

Query: 190 GSPLGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAINP 236

G G F+N AIT + + Q TN NA+ + NP
Sbjct: 602 GLTTGRAFSNGALVSISGKTGTAESYVADGQQATNTNAVAYAPSDNP 648



> ref|NP_358390.1| 6-phosphofructokinase I [Streptococcus pneumoniae R6]
Length = 335 

Score =24.3 bits (51), Expect =6.9 

Identities = 14/49 (28%), Positives = 24/49 (48%) 

Query: 246 IEGQVIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKLEADGKIS 294

I G +GI K+ T+ EG F++ + + + N EAD ++S

Sbjct: 280 IGGVAVGIRNEKMVENPILGTAEEGALFSLTAEGKIWNNPHEADIELS 328

> ref|NP_359031.1| Isochorismatase [Streptococcus pneumoniae R6]
Length = 191

Score = 24.3 bits (51), Expect = 6.9

Identities = 13/31 (41%), Positives = 18/31 (58%), Gaps = 2/31 (6%)

Query: 163 ISSEHVKDVATFADSSKLTIGEPAIAVGSPL 193 

IS ++ +D ADS KLT G PA A+ + 
Sbjct: 6 ISIDYTEDFV--ADSGKLTAGAPAQAISDAI 34



> ref|NP_359032.1| Transcriptional pleiotropic repressor [Streptococcus
pneumoniae R6] 

Length = 262 

Score = 23.9 bits (50), Expect = 9.0 

Identities = 24/105 (22%), Positives = 40/105 (38%), Gaps = 22/105 (20%) 

Query: 213 VTLTQENGQTTNINAIQTDAAINPGNSGGALINIEGQVIGITQSKITTTEDGSTSVEGLG 272 

VT+ + + A+ GN G ++ IGIT+S I 
Sbjct: 176 VTMAWTLSYSELRAVSAILGELNGNEGQLTASVIADRIGITRSVI 221 

Query: 273 FAIPSNDVVNIINKLEADGKISRPALGIRMVDLSQLSTNDSSQLK 317

VN + KLE+ G I +LG++ L L ++ ++K 
Sbjct: 222 VNALRKLESAGIIESRSLGMKGTYLKVLISDIFEEVK 258 



Database: Completed Streptococcus pneumoniae R6 

Posted date: Oct 1, 2003 10:43 PM 
Number of letters in database: 589,192 
Number of sequences in database: 2043 



Lambda K H 

0.308 0.128 0.338 

Gapped 

Lambda K H 

0.267 0.0410 0.140 



Matrix: BLOSUM62 

Gap Penalties: Existence: 11, Extension: 1 

Number of Hits to DB: 33,145 

Number of Sequences: 2043 

Number of extensions: 1151 

Number of successful extensions: 4 

Number of sequences better than 10.0: 1 

Number of HSP's better than 10.0 without gapping: 0 

Number of HSP's successfully gapped in prelim test: 1 

Number of HSP's that attempted gapping in prelim test: 4 

Number of HSP's gapped (non-prelim): 1 

length of query: 408 

length of database: 588,593

effective HSP length: 81
effective length of query: 327 
effective length of database: 423,515 
effective search space: 138489405 
effective search space used: 138489405

ANNEX 7

BLASTP 2.2.6 [Apr-09-2003] 

RID: 1065207829-10461-370696.BLASTQ3



Query= 

(408 letters) 



Database: Completed Streptococcus pyogenes SSI-1 ; 



1,531,058 sequences; 495,743,110 total letters 



Taxonomy reports 



                                                                   Score     E
Sequences producing significant alignments:                       (bits)  Value

ref|NP_803122.1|  putative serine protease [Streptococcus py...      311   2e-86
ref|NP_801300.1|  putative alcohol dehydrogenase I [Streptoc...       32   0.030
ref|NP_802185.1|  putative phage-related tail protein [Strep...       31   0.067
ref|NP_802392.1|  putative tail protein, phage assocaited [S...       29   0.20
ref|NP_801584.1|  putative transcriptional pleiotropic repre...       29   0.26
ref|NP_803013.1|  putative transcriptional regulator [Strept...       25   2.8
ref|NP_802639.1|  conserved hypothetical protein [Streptococ...       25   4.8
ref|NP_801305.1|  50S ribosomal protein L4 [Streptococcus py...       25   4.8



Alignments 



> ref|NP_803122.1| putative serine protease [Streptococcus pyogenes SSI-1]
Length = 407 

Score = 311 bits (797), Expect = 2e-86 

Identities = 163/288 (56%), Positives = 207/288 (71%), Gaps = 5/288 (1%) 

Query: 107 EGSGVIYKKSGGDAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSE 166

EGSGVIY+K G lY^OTSiWl G +++L++ G KV +VG D Y+D§A\|+KISS+
Sbjct: 108 EGSGVIYRKDGNSgYg^ 167

Query: 167 HVKDVATFADSSKLTIGEPAIAVGSPLGSQFANTATEGILSATSRQVTLTQENGQTTNIN 226
+K VA FADS + KL +GE AIA+GSPLG+Q+AN+ T+GI+S+ SR VTL ENG+T + N
Sbjct: 168 KIKTVAEFADSTKLNVGEVAIAIGSPLGTQYANSVTQGIVSSLSRTVTLKNENGETVSTN 227

Query: 227 AIQTDAAINPGNSGGALINIEGQVIGITQSKITTTEDGST----SVEGLGFAIPSNDVVN 282
AIQTDAAINP6Ngtof^pEG@Jf?gJ^ SKI++T GS +VEG+GFAIPS DV+
Sbjct: 228 AIQTDAAINp|^^^^^||gSgSsSKISSTPTGSNGNSGAVEGIGFAIPSTDVIK 287

Query: 283 IINKLEADGKISRPALGIRMVDLSQLSTNDSSQLKLPXXXXXXXXXXXXXXXLPAASAGL 342
II +LE +G++ RPALGI MV+L+ LSTN SQ+ +P +P AS L
Sbjct: 288 IIKQLETNGEVIRPALGISMVNLNDLSTNALSQINIPTSVTGGIVVAEVKEGMP-ASGKL 346

Query: 343 KAGDVITKVGDTAVTSSTDLQSALYSHNINDTVKVTYYRDGKSNTADV 390
DVIT++ V S +DLQS+LY H+INDT+KVT+YR AD+
Sbjct: 347 AQYDVITEIDGKTVNSISDLQSSLYGHDINDTIKVTFYRGTTKKKADI 394





> ref|NP_801300.1| putative alcohol dehydrogenase I [Streptococcus pyogenes SSI-1]

Length = 282 
Score = 32.0 bits (71), Expect = 0.030 

Identities = 21/71 (29%), Positives = 37/71 (52%), Gaps = 3/71 (4%) 

Query: 110 GVIYKKSGG-DAYVVTNYHVIAGNSSLDVLLSGGQKVKASVVGYDEYTDLAVLKISSEHV 168

G I +K+GG VVT +A N ++D + +GG V + EY +L+++K + +
Sbjct: 166 GYIQEKTGGAHGVVVTAVSKVAFNQAIDSVRAGGTVVAVGLP--SEYMELSIVKTVLDGI 223

Query: 169 KDVATFADSSK 179

K V + + K
Sbjct: 224 KVVGSLVGTRK 234



> ref|NP_802185.1| putative phage-related tail protein [Streptococcus pyogenes
SSI-1]

Length = 1307
Score = 30.8 bits (68), Expect = 0.067

Identities = 48/213 (22%), Positives = 86/213 (40%), Gaps = 30/213 (14%)

Query: 153 DEYTDLAVLKISSEHVKDVATFADSSKLTIGEPAIAVGSPLGSQFANTATEGILSATSRQ 212 

DE ++ K+S + ++ +A +S + I A A G T ILS + 

Sbjct: 192 DETATVSYAKLS -QGIRQMAKELPASAVEIAHVAEAAGQ LGVKTGDILSFSRTM 244 

Query: 213 VTLTQENGQTTNINAIQTDAAINPGNSGGALINIEGQVIGITQSKITTTEDGSTSVEGLG 272 

+ L G++TN++A + +1 + NI G + S+ + ++ G 

Sbjct: 245 IDL GESTNLSAEEAATS I AKIANITG LASSEYSRFGSAWAL-GNN 289 

Query: 273 FAIPSNDVVNIINKLEADGKISRPALGIRMVDLSQLSTNDSSQLKLPXXXXXXXXXXXXX 332

FA D+V + N++ A GK+ + G+ ++ L+T SS + + 
Sbjct: 290 FATTEKDIVAMTNRIAASGKLA GLTNQEMLALATAMSS -VGIEAEAGGTAMTQSLS 344

Query: 333 XXLPAASAGLKAGDVITKVGDTAVTSSTDLQSA 365 

A ++G GD + K A SS D A 
Sbjct: 345 AIERAVASG GDNLNKFAQIANMSSADFARA 374

> ref|NP_802392.1| putative tail protein, phage assocaited [Streptococcus pyogenes
SSI-1]
Length = 1372 

Score = 29.3 bits (64), Expect = 0.20 

Identities = 34/146 (23%), Positives = 57/146 (39%), Gaps = 19/146 (13%) 

Query: 220 GQTTNINAIQTDAAINPGNSGGALINIEGQVIGITQSKITTTEDGSTSVEGLGFAIPSND 279 

GQ+TN++A + ++I + NI G SK + S G F+ D 

Sbjct: 254 GQSTNLSAEEAASS I AKIANITGLT SKEYSRFGSSWALGNNFSTTERD 302 

Query: 280 VVNIINKLEADGKISRPALGIRMVDLSQLSTNDSSQLKLPXXXXXXXXXXXXXXXLPAAS 339

V+ + N++ A GK++ G+ ++ L+T SS + 
Sbjct: 303 VIAMTNRIAASGKLA GLTNQEMLALATAMSS VGI EAEAGGTAMTQTLS AI ET 354 

Query: 340 AGLKAGDVITKVGDTAVTSSTDLQSA 365 

A + G+ +TK A SS D A 
Sbjct: 355 AVINGGEDLTKFAQIANMSSKDFAKA 380



> ref|NP_801584.1| putative transcriptional pleiotropic repressor [Streptococcus
pyogenes SSI-1]
Length = 260

Score = 28.9 bits (63), Expect = 0.26

Identities = 34/128 (26%), Positives = 52/128 (40%), Gaps = 26/128 (20%) 

Query: 193 LGSQFANTATEGILSATSRQVTLTQENGQTTNINAIQTDAAIN---PGNSGGALINIEGQ 249

+G Q N TE L T R+ T T + + ++ AAI GN G + + 

Sbjct: 154 VGIQLLNLQTEN-LEDTIRKQTAVNMAINTLSYSEMKAVAAILGELDGNEGRLTASVIAD 212 

Query: 250 VIGITQSKITTTEDGSTSVEGLGFAIPSNDVVNIINKLEADGKISRPALGIRMVDLSQLS 309

IGIT+S I VN + KLE+ G I +LG++ L ++

Sbjct: 213 RIGITRSVI----------------------VNALRKLESAGIIESRSLGMKGTYLKVIN 250

Query: 310 TNDSSQLK 317 
++LK 

Sbjct: 251 EGIFAKLK 258 



> ref|NP_803013.1| putative transcriptional regulator [Streptococcus pyogenes
SSI-1]

Length = 326 
Score = 25.4 bits (54), Expect = 2.8 

Identities = 21/79 (26%), Positives = 36/79 (45%), Gaps = 7/79 (8%) 

Query: 112 IYKKSGGDAYVVTNYHVIAGNSSLDVLLSGGQK-VKASVVGYDEYTDLAVLKISSEHVKD 170

IY +GG +++ YHV L + G + A V+ D++ +L+ S++ D 

Sbjct: 138 IYPLAGGPSHINAKYHVNTLVYRLARIFHGNSAFMNAMVIQEDKHLAKGILQ--SKYFND 195 

Query: 171 VAT FADS SKLT I GEP 185 

+ T D L + GEP 
Sbjct: 196 ILTSWDQLDLALVGIGGEP 214

> ref|NP_802639.1| conserved hypothetical protein [Streptococcus pyogenes SSI-1]
Length =574 

Score =24.6 bits (52), Expect =4.8 

Identities = 10/32 (31%), Positives =20/32 (62%) 

Query: 345 GDVITKVGDTAVTSSTDLQSALYSHNINDTVK 376

G +I K D+ +TS + + AL++ +ND ++
Sbjct: 23 GVIIRKRNDSLITSLEERKQALFALPVNDEIE 54 



> ref|NP_801305.1| 50S ribosomal protein L4 [Streptococcus pyogenes SSI-1]
Length =207 

Score =24.6 bits (52), Expect =4.8 

Identities = 12/33 (36%), Positives = 18/33 (54%), Gaps = 1/33 (3%) 

Query: 262 EDGSTSVEGLGFAIPSN-DVVNIINKLEADGKI 293

ED +VEGL FA P + +++ L D K+ 
Sbjct: 120 EDKFVAVEGLSFAAPKTAEFAKVLSALSIDTKV 152 



Database: Completed Streptococcus pyogenes SSI-1 

Posted date: Oct 1, 2003 10:43 PM 
Number of letters in database: 534,258 
Number of sequences in database: 1861 

Lambda K H 

0.308 0.128 0.338 

Gapped 

Lambda K H 

0.267 0.0410 0.140 



Matrix: BLOSUM62 

Gap Penalties: Existence: 11, Extension: 1 

Number of Hits to DB: 32,970 

Number of Sequences: 1861 

Number of extensions: 1177 

Number of successful extensions: 3 

Number of sequences better than 10.0: 1 

Number of HSP's better than 10.0 without gapping: 0 

Number of HSP's successfully gapped in prelim test: 1 

Number of HSP's that attempted gapping in prelim test: 3 

Number of HSP's gapped (non-prelim): 1 

length of query: 408

length of database: 532,687 

effective HSP length: 80 

effective length of query: 328 

effective length of database: 384,447 

effective search space: 126098616 

effective search space used: 126098616

T: 11
A: 40

X1: 16 ( 7.1 bits)

X2: 38 (14.6 bits)

X3: 64 (24.7 bits)

S1: 42 (21.7 bits)

S2: 50 (23.9 bits)